
Valgrind

  • Tool suite to
    • detect memory management bugs
    • detect threading bugs
    • profile your programs
  • Prepare an unoptimized binary with debug symbols at compile time
  • Dynamic analysis at runtime
  • Binary is executed in a virtual machine with just-in-time (JIT) translation
    ⇒ Massive increase in runtime and memory consumption
  • Problems that can be detected by memcheck
    • use of memory after free
    • use of uninitialized values
    • out of bounds memory access
    • memory leaks
  • Potentially many false positives
    ⇒ suppression mechanism (suppression files)
  • valgrind.org
  • Valgrind User Manual
  • Basic usage

    module add \
        compiler/gnu \
        devel/valgrind
    gcc -ggdb ${SRC} -o ${BINARY}
    valgrind ${BINARY}
    

Example: Valgrind - Use after free

  • Source code use_after_free.c++

    #include <iostream>
    
    int main(int argc, char *argv[]) {
        auto *charArray = new char[10];
        delete [] charArray;
        std::cout << charArray[0] << std::endl;
    }
    
  • Set up valgrind and build environment

    module add \
        compiler/gnu \
        devel/valgrind
    
  • Build application

    c++ -ggdb use_after_free.c++ -o use_after_free
    
  • Examine the use_after_free program with Valgrind's Memcheck tool (the default tool)

    valgrind ./use_after_free
    
    ==3318937== Memcheck, a memory error detector
    ==3318937== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
    ==3318937== Using Valgrind-3.20.0 and LibVEX; rerun with -h for copyright info
    ==3318937== Command: ./use_after_free
    ==3318937== 
    ==3318937== Invalid read of size 1
    ==3318937==    at 0x400840: main (use_after_free.c++:6)
    ==3318937==  Address 0x5bcec80 is 0 bytes inside a block of size 10 free'd
    ==3318937==    at 0x4C3B6FB: operator delete[](void*) (in /.../libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==3318937==    by 0x40083F: main (use_after_free.c++:5)
    ==3318937==  Block was alloc'd at
    ==3318937==    at 0x4C391AF: operator new[](unsigned long) (in /.../libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==3318937==    by 0x400834: main (use_after_free.c++:4)
    ==3318937== 
    
    ==3318937== 
    ==3318937== HEAP SUMMARY:
    ==3318937==     in use at exit: 0 bytes in 0 blocks
    ==3318937==   total heap usage: 3 allocs, 3 frees, 73,738 bytes allocated
    ==3318937== 
    ==3318937== All heap blocks were freed -- no leaks are possible
    ==3318937== 
    ==3318937== For lists of detected and suppressed errors, rerun with: -s
    ==3318937== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
    

Example: Valgrind - Out-of-bounds access

  • Source code out_of_bound_access.c++

    #include <iostream>
    
    int main(int argc, char *argv[]) {
        auto charArray = new char[10];
        charArray[10] = 10;
        delete []charArray;
    }
    
  • Set up valgrind and build environment

    module add \
        compiler/gnu \
        devel/valgrind
    
  • Build application

    c++ -ggdb out_of_bound_access.c++ -o out_of_bound_access
    
  • Examine the out_of_bound_access program with Valgrind's Memcheck tool

    valgrind --leak-check=full --track-origins=yes --show-reachable=yes ./out_of_bound_access
    
    ==17616== Memcheck, a memory error detector
    ==17616== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
    ==17616== Using Valgrind-3.20.0 and LibVEX; rerun with -h for copyright info
    ==17616== Command: ./out_of_bound_access
    ==17616==
    ==17616== Invalid write of size 1
    ==17616==    at 0x10916E: main (out_of_bound_access.c++:5)
    ==17616==  Address 0x4e2208a is 0 bytes after a block of size 10 alloc'd
    ==17616==    at 0x48432E3: operator new[](unsigned long) (vg_replace_malloc.c:652)
    ==17616==    by 0x109161: main (out_of_bound_access.c++:4)
    ==17616==
    ==17616==
    ==17616== HEAP SUMMARY:
    ==17616==     in use at exit: 0 bytes in 0 blocks
    ==17616==   total heap usage: 2 allocs, 2 frees, 73,738 bytes allocated
    ==17616==
    ==17616== All heap blocks were freed -- no leaks are possible
    ==17616==
    ==17616== For lists of detected and suppressed errors, rerun with: -s
    ==17616== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
    

Example: OpenMPI with valgrind support

  • FAQ: Debugging applications in parallel: What kind of errors can Memchecker find?
  • Set up valgrind and build environment

    module purge
    export VALGRIND=true
    module add \
        compiler/gnu \
        mpi/openmpi \
        devel/valgrind
    
    # or to update environment
    export VALGRIND=true
    module update
    
  • To check whether the loaded OpenMPI build includes the memchecker component (Valgrind support), run:

    ompi_info | grep memchecker
    
  • Build rank_league benchmark with debug symbols and low optimization settings

    mpicc -ggdb rank_league.c -o rank_league
    
  • Examine the rank_league benchmark with Valgrind's Memcheck tool, writing one log file per rank

    mpirun -np 4  bash -c \
        'valgrind \
             --log-file=valgrind.out.${OMPI_COMM_WORLD_RANK} \
             --suppressions="${MPI_ROOT}/share/openmpi/openmpi-valgrind.supp" \
             --suppressions="${MPI_ROOT}/share/pmix/pmix-valgrind.supp" \
             --suppressions="/usr/share/hwloc/hwloc-valgrind.supp" \
             ./rank_league
        '
    

    ⇒ many false positives
    ⇒ massive increase in runtime

  • The valgrind -v command-line option shows which suppression files are read

    ...
    --3322524-- Reading suppressions file: /software/all/mpi/openmpi/4.1_gnu_11_valgrind/share/openmpi/openmpi-valgrind.supp
    --3322524-- Reading suppressions file: /software/all/mpi/openmpi/4.1_gnu_11_valgrind/share/pmix/pmix-valgrind.supp
    --3322524-- Reading suppressions file: /usr/share/hwloc/hwloc-valgrind.supp
    --3322524-- Reading suppressions file: /software/all/devel/valgrind/3.20.0_gnu_11_openmpi_4.1/libexec/valgrind/default.supp
    ...
    

Example: Valgrind with MPI support

  • Set up valgrind environment

    module purge
    export VALGRIND=true
    module add \
        compiler/gnu \
        mpi/openmpi \
        devel/valgrind
    
    # or to update environment
    export VALGRIND=true
    module update
    
  • Build rank_league benchmark with debug symbols and low optimization settings

    mpicc -ggdb rank_league.c -o rank_league
    
  • Examine rank_league benchmark with valgrind MPI wrapper

    mpirun -np 4 bash -c \
        'LD_PRELOAD="${VALGRIND_LIB_DIR}/valgrind/libmpiwrap-amd64-linux.so" \
         MPIWRAP_DEBUG="warn" \
         valgrind \
             --log-file=valgrind.out.${OMPI_COMM_WORLD_RANK} \
             ./rank_league 2>valgrind.stderr.${OMPI_COMM_WORLD_RANK}'
    

    valgrind.stderr.0:

    valgrind MPI wrappers 785165: Active for pid 785165
    valgrind MPI wrappers 785165: Try MPIWRAP_DEBUG=help for possible options
    valgrind MPI wrappers 785165: warning: no wrapper for PMPI_Get_processor_name
    valgrind MPI wrappers 785165: warning: no wrapper for PMPI_Barrier
    

    ⇒ fewer false positives
    ⇒ still massive increase in runtime

  • Valgrind MPI wrapper options

    LD_PRELOAD="${VALGRIND_LIB_DIR}/valgrind/libmpiwrap-amd64-linux.so" \
    MPIWRAP_DEBUG="help" \
    valgrind \
        ./rank_league
    
    Valid options for the MPIWRAP_DEBUG environment variable are:
    
      quiet       be silent except for errors
      verbose     show wrapper entries/exits
      strict      abort the program if a function with no wrapper is used
      warn        give a warning if a function with no wrapper is used
      help        display this message, then exit
      initkludge  debugging hack; do not use
    ...