When does fragmentation occur in the CUDA caching allocator? (docs.pytorch.org)
CUDA memory fragmentation occurs when certain allocation patterns prevent the allocator from serving requests, even with available free space, due to its internal implementation.