END: BX LR
@ Result depends on the specific byte layout, but typically
@ one register will collect the even bytes and the other the odd bytes.
@ If input is A R G B A R G B...
@ Q0 might become: A A A A R R R R (All Alphas and Reds packed)
@ Q1 might become: G G G G B B B B (All Greens and Blues packed)
Here is a complete function that interleaves two uint16_t arrays:
Think of the NEON zip instruction like a physical zipper. It takes two separate registers and interleaves their elements.
ZIP (VZIP on ARMv7) interleaves elements from two source registers into one or two destination registers. It’s the NEON equivalent of a SIMD “zip” operation.
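To make the lane ordering concrete, here is a minimal scalar sketch in portable C of what a 16-bit zip does. It is a model for reasoning, not NEON code: the names zip_u16 and vzip16_model are hypothetical, and the in-place behaviour assumes the ARMv7 form VZIP.16 Qd, Qm, where both registers are overwritten with the low and high halves of the interleaved result.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LANES 8  /* a 128-bit Q register holds eight 16-bit lanes */

/* Interleave two equal-length arrays: out = a0 b0 a1 b1 ...
 * out must have room for 2 * n elements. (Hypothetical helper.) */
void zip_u16(const uint16_t *a, const uint16_t *b, uint16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[2 * i]     = a[i];  /* even slots come from the first input  */
        out[2 * i + 1] = b[i];  /* odd slots come from the second input  */
    }
}

/* Scalar model of an in-place 16-bit zip on two Q registers:
 * q0 receives the low half of the interleaved sequence,
 * q1 receives the high half. (Hypothetical helper.) */
void vzip16_model(uint16_t q0[LANES], uint16_t q1[LANES])
{
    uint16_t tmp[2 * LANES];

    zip_u16(q0, q1, tmp, LANES);
    memcpy(q0, tmp,         LANES * sizeof(uint16_t));
    memcpy(q1, tmp + LANES, LANES * sizeof(uint16_t));
}
```

With q0 = a0..a7 and q1 = b0..b7, the model leaves q0 = a0 b0 a1 b1 a2 b2 a3 b3 and q1 = a4 b4 a5 b5 a6 b6 a7 b7, which is the zipper pattern described above.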
There are two main variations based on element size:
@ Step 1: 8-bit Transpose (Usually VTRN.8) - Skipped for brevity
@ The registers Q0-Q7 now effectively hold the Columns of the original matrix.