To resolve this, I modified the truncation machinery to handle partial truncation. This is hard because you may have many indirect blocks involved with many children blocks in different states. An indirect is a filesystem block that does nothing but point to other blocks, like a page table does for memory. It also had to handle ffs's somewhat complex fragment rules as well as zeroing partially empty blocks. It handles all of this now and supports an arbitrary number of partial truncations to the same file without any blocking operations. It always keeps the on disk copy safe while the in memory copy is free to grow again and indeed be truncated again after that. New pointers are not recorded in an indirect until prior truncation completes so there is no ambiguity about what revision of the file the blocks are from. This brings more complexity to fsync() which must now flush all pending truncations to disk before it can return.
The truncation code is a kind of asynchronous state machine that operates on leaf blocks first and then walks backwards up the tree until it reaches the root. This ensures that we always have a valid path to a block in case we crash. Indirects are only freed when all of their children are freed. For partial truncate, the block is zeroed only once those child pointers that need be are freed. Finally when all blocks have been freed the journal space can be reclaimed.
This post can not convey how complex this work was. It may not sound very dramatic or impressive but it truly has been one of the most complex projects I have ever undertaken.