<pre style='margin:0'>
Christopher Nielsen (mascguy) pushed a commit to branch master
in repository macports-ports.
</pre>
<p><a href="https://github.com/macports/macports-ports/commit/0f0bc622e240b84e0484dbdc5c0b59bc19fde820">https://github.com/macports/macports-ports/commit/0f0bc622e240b84e0484dbdc5c0b59bc19fde820</a></p>
<pre style="white-space: pre; background: #F8F8F8">The following commit(s) were added to refs/heads/master by this push:
<span style='display:block; white-space:pre;color:#404040;'> new 0f0bc622e24 libpixman: update to 0.43.4
</span>0f0bc622e24 is described below
<span style='display:block; white-space:pre;color:#808000;'>commit 0f0bc622e240b84e0484dbdc5c0b59bc19fde820
</span>Author: Christopher Nielsen <mascguy@github.com>
AuthorDate: Sun Apr 14 17:54:32 2024 -0400
<span style='display:block; white-space:pre;color:#404040;'> libpixman: update to 0.43.4
</span><span style='display:block; white-space:pre;color:#404040;'>
</span><span style='display:block; white-space:pre;color:#404040;'> Fixes: https://trac.macports.org/ticket/69219
</span>---
graphics/libpixman/Portfile | 14 +-
graphics/libpixman/files/patch-pixman-arm.diff | 3258 --------------------
.../libpixman/files/patch-pixman-pixman-vmx.c.diff | 14 -
3 files changed, 4 insertions(+), 3282 deletions(-)
<span style='display:block; white-space:pre;color:#808080;'>diff --git a/graphics/libpixman/Portfile b/graphics/libpixman/Portfile
</span><span style='display:block; white-space:pre;color:#808080;'>index a5de48037df..ee7867297eb 100644
</span><span style='display:block; white-space:pre;background:#e0e0ff;'>--- a/graphics/libpixman/Portfile
</span><span style='display:block; white-space:pre;background:#e0e0ff;'>+++ b/graphics/libpixman/Portfile
</span><span style='display:block; white-space:pre;background:#e0e0e0;'>@@ -14,11 +14,11 @@ legacysupport.newest_darwin_requires_legacy 9
</span> name libpixman
conflicts libpixman-devel
set my_name pixman
<span style='display:block; white-space:pre;background:#ffe0e0;'>-version 0.42.2
</span><span style='display:block; white-space:pre;background:#e0ffe0;'>+version 0.43.4
</span> revision 0
<span style='display:block; white-space:pre;background:#ffe0e0;'>-checksums rmd160 282e3f6fc956391df67398a414b083d221ffdf4d \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sha256 5747d2ec498ad0f1594878cc897ef5eb6c29e91c53b899f7f71b506785fc1376 \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- size 652984
</span><span style='display:block; white-space:pre;background:#e0ffe0;'>+checksums rmd160 344d0e77ec49ac2bdf6cb2c5fe7595b44d07dea7 \
</span><span style='display:block; white-space:pre;background:#e0ffe0;'>+ sha256 48d8539f35488d694a2fef3ce17394d1153ed4e71c05d1e621904d574be5df19 \
</span><span style='display:block; white-space:pre;background:#e0ffe0;'>+ size 636900
</span>
categories graphics
maintainers {ryandesign @ryandesign} {mascguy @mascguy}
<span style='display:block; white-space:pre;background:#e0e0e0;'>@@ -40,12 +40,6 @@ long_description libpixman is a generic library for manipulating pixel \
</span> # Disable unexpected download of subprojects
meson.wrap_mode nodownload
<span style='display:block; white-space:pre;background:#ffe0e0;'>-patchfiles-append patch-pixman-pixman-vmx.c.diff
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-# Upstream patch for ARM support. Merged into master; remove for next release.
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-# Source: https://gitlab.freedesktop.org/pixman/pixman/-/merge_requests/71
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-patchfiles-append patch-pixman-arm.diff
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span> # Upstream patch for dylib versioning. Expected to be included in next release.
# See: https://gitlab.freedesktop.org/pixman/pixman/-/issues/81
patchfiles-append patch-meson-dylib-versions.diff
<span style='display:block; white-space:pre;color:#808080;'>diff --git a/graphics/libpixman/files/patch-pixman-arm.diff b/graphics/libpixman/files/patch-pixman-arm.diff
</span>deleted file mode 100644
<span style='display:block; white-space:pre;color:#808080;'>index f30ea69cd0c..00000000000
</span><span style='display:block; white-space:pre;background:#e0e0ff;'>--- a/graphics/libpixman/files/patch-pixman-arm.diff
</span><span style='display:block; white-space:pre;background:#e0e0ff;'>+++ /dev/null
</span><span style='display:block; white-space:pre;background:#e0e0e0;'>@@ -1,3258 +0,0 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-#==================================================================================================
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-# Upstream patch for ARM support
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-#
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-# Source: https://gitlab.freedesktop.org/pixman/pixman/-/merge_requests/71
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-#==================================================================================================
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>---- meson.build
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+++ meson.build
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -243,6 +243,34 @@ if not use_vmx.disabled()
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+if cc.compiles('''
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ __asm__ (
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ".func meson_test"
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ".endfunc"
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ );''',
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ name : 'test for ASM .func directive')
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ config.set('ASM_HAVE_FUNC_DIRECTIVE', 1)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+if cc.links('''
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ #include <stdint.h>
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ __asm__ (
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ " .global _testlabel\n"
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ "_testlabel:\n"
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ );
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ int testlabel();
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ int main(int argc, char* argv[]) {
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ return testlabel();
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ }''',
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ name : 'test for ASM leading underscore')
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ config.set('ASM_LEADING_UNDERSCORE', 1)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- if have_vmx
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- config.set10('USE_VMX', true)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- elif use_vmx.enabled()
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>---- pixman/pixman-arm-asm.h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+++ pixman/pixman-arm-asm.h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -25,13 +25,33 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- *
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+#include "config.h"
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* Supplementary macro for setting function attributes */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.macro pixman_asm_function fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .func fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .global fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.macro pixman_asm_function_impl fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+#ifdef ASM_HAVE_FUNC_DIRECTIVE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .func \fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+#endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .global \fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- #ifdef __ELF__
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .hidden fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .type fname, %function
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .hidden \fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .type \fname, %function
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+#endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+\fname:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.macro pixman_asm_function fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+#ifdef ASM_LEADING_UNDERSCORE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixman_asm_function_impl _\fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+#else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixman_asm_function_impl \fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+#endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.macro pixman_end_asm_function
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+#ifdef ASM_HAVE_FUNC_DIRECTIVE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .endfunc
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- #endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--fname:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>---- pixman/pixman-arma64-neon-asm-bilinear.S
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+++ pixman/pixman-arma64-neon-asm-bilinear.S
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -77,50 +77,50 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr WTMP1, X, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add X, X, UX
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add TMP1, TOP, TMP1, lsl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&reg1&.2s}, [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&reg2&.2s}, [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\reg1\().2s}, [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\reg2\().2s}, [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_load_0565 reg1, reg2, tmp
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr WTMP1, X, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add X, X, UX
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add TMP1, TOP, TMP1, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&reg2&.s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&reg2&.s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- convert_four_0565_to_x888_packed reg2, reg1, reg2, tmp
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\reg2\().s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\reg2\().s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ convert_four_0565_to_x888_packed \reg2, \reg1, \reg2, \tmp
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_load_and_vertical_interpolate_two_8888 \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- acc1, acc2, reg1, reg2, reg3, reg4, tmp1, tmp2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_8888 reg1, reg2, tmp1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &acc1&.8h, &reg1&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &acc1&.8h, &reg2&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_8888 reg3, reg4, tmp2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &acc2&.8h, &reg3&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &acc2&.8h, &reg4&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_8888 \reg1, \reg2, \tmp1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\acc1\().8h, \()\reg1\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\acc1\().8h, \()\reg2\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_8888 \reg3, \reg4, \tmp2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\acc2\().8h, \()\reg3\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\acc2\().8h, \()\reg4\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_load_and_vertical_interpolate_four_8888 \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- xacc1, xacc2, xreg1, xreg2, xreg3, xreg4, xacc2lo, xacc2hi \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ xacc1, xacc2, xreg1, xreg2, xreg3, xreg4, xacc2lo, xacc2hi, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- yacc1, yacc2, yreg1, yreg2, yreg3, yreg4, yacc2lo, yacc2hi
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bilinear_load_and_vertical_interpolate_two_8888 \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- xacc1, xacc2, xreg1, xreg2, xreg3, xreg4, xacc2lo, xacc2hi
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \xacc1, \xacc2, \xreg1, \xreg2, \xreg3, \xreg4, \xacc2lo, xacc2hi
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bilinear_load_and_vertical_interpolate_two_8888 \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- yacc1, yacc2, yreg1, yreg2, yreg3, yreg4, yacc2lo, yacc2hi
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \yacc1, \yacc2, \yreg1, \yreg2, \yreg3, \yreg4, \yacc2lo, \yacc2hi
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro vzip reg1, reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- zip1 v24.8b, reg1, reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- zip2 reg2, reg1, reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov reg1, v24.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ zip1 v24.8b, \reg1, \reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ zip2 \reg2, \reg1, \reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \reg1, v24.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro vuzp reg1, reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- uzp1 v24.8b, reg1, reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- uzp2 reg2, reg1, reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov reg1, v24.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ uzp1 v24.8b, \reg1, \reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ uzp2 \reg2, \reg1, \reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \reg1, v24.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_load_and_vertical_interpolate_two_0565 \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -131,23 +131,23 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr WTMP2, X, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add X, X, UX
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add TMP2, TOP, TMP2, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&acc2&.s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&acc2&.s}[2], [TMP2], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&acc2&.s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&acc2&.s}[3], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- convert_0565_to_x888 acc2, reg3, reg2, reg1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &reg1&.8b, &reg3&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &reg2&.8b, &reg4&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &reg3&.8b, &reg4&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &reg1&.8b, &reg2&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &acc1&.8h, &reg1&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &acc1&.8h, &reg2&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &acc2&.8h, &reg3&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &acc2&.8h, &reg4&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\acc2\().s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\acc2\().s}[2], [TMP2], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\acc2\().s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\acc2\().s}[3], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ convert_0565_to_x888 \acc2, \reg3, \reg2, \reg1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\reg1\().8b, \()\reg3\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\reg2\().8b, \()\reg4\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\reg3\().8b, \()\reg4\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\reg1\().8b, \()\reg2\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\acc1\().8h, \()\reg1\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\acc1\().8h, \()\reg2\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\acc2\().8h, \()\reg3\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\acc2\().8h, \()\reg4\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_load_and_vertical_interpolate_four_0565 \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- xacc1, xacc2, xreg1, xreg2, xreg3, xreg4, xacc2lo, xacc2hi \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ xacc1, xacc2, xreg1, xreg2, xreg3, xreg4, xacc2lo, xacc2hi, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- yacc1, yacc2, yreg1, yreg2, yreg3, yreg4, yacc2lo, yacc2hi
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr WTMP1, X, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -156,49 +156,49 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr WTMP2, X, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add X, X, UX
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add TMP2, TOP, TMP2, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&xacc2&.s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&xacc2&.s}[2], [TMP2], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&xacc2&.s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&xacc2&.s}[3], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- convert_0565_to_x888 xacc2, xreg3, xreg2, xreg1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\xacc2\().s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\xacc2\().s}[2], [TMP2], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\xacc2\().s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\xacc2\().s}[3], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ convert_0565_to_x888 \xacc2, \xreg3, \xreg2, \xreg1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr WTMP1, X, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add X, X, UX
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add TMP1, TOP, TMP1, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr WTMP2, X, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add X, X, UX
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add TMP2, TOP, TMP2, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&yacc2&.s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &xreg1&.8b, &xreg3&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&yacc2&.s}[2], [TMP2], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &xreg2&.8b, &xreg4&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&yacc2&.s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &xreg3&.8b, &xreg4&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&yacc2&.s}[3], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &xreg1&.8b, &xreg2&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- convert_0565_to_x888 yacc2, yreg3, yreg2, yreg1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &xacc1&.8h, &xreg1&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &yreg1&.8b, &yreg3&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &xacc1&.8h, &xreg2&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &yreg2&.8b, &yreg4&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &xacc2&.8h, &xreg3&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &yreg3&.8b, &yreg4&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &xacc2&.8h, &xreg4&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &yreg1&.8b, &yreg2&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &yacc1&.8h, &yreg1&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &yacc1&.8h, &yreg2&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &yacc2&.8h, &yreg3&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &yacc2&.8h, &yreg4&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\yacc2\().s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\xreg1\().8b, \()\xreg3\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\yacc2\().s}[2], [TMP2], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\xreg2\().8b, \()\xreg4\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\yacc2\().s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\xreg3\().8b, \()\xreg4\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\yacc2\().s}[3], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\xreg1\().8b, \()\xreg2\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ convert_0565_to_x888 \yacc2, \yreg3, \yreg2, \yreg1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\xacc1\().8h, \()\xreg1\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\yreg1\().8b, \()\yreg3\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\xacc1\().8h, \()\xreg2\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\yreg2\().8b, \()\yreg4\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\xacc2\().8h, \()\xreg3\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\yreg3\().8b, \()\yreg4\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\xacc2\().8h, \()\xreg4\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\yreg1\().8b, \()\yreg2\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\yacc1\().8h, \()\yreg1\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\yacc1\().8h, \()\yreg2\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\yacc2\().8h, \()\yreg3\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\yacc2\().8h, \()\yreg4\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_store_8888 numpix, tmp1, tmp2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if numpix == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \numpix == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v0.2s, v1.2s}, [OUT], #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v0.2s}, [OUT], #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v0.s}[0], [OUT], #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .error bilinear_store_8888 numpix is unsupported
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .error bilinear_store_8888 \numpix is unsupported
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -207,15 +207,15 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- vuzp v2.8b, v3.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- vuzp v1.8b, v3.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- vuzp v0.8b, v2.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- convert_8888_to_0565 v2, v1, v0, v1, tmp1, tmp2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if numpix == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ convert_8888_to_0565 v2, v1, v0, v1, \tmp1, \tmp2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \numpix == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v1.4h}, [OUT], #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v1.s}[0], [OUT], #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v1.h}[0], [OUT], #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .error bilinear_store_0565 numpix is unsupported
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .error bilinear_store_0565 \numpix is unsupported
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -228,20 +228,20 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_load_mask_8 numpix, mask
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if numpix == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&mask&.s}[0], [MASK], #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&mask&.h}[0], [MASK], #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&mask&.b}[0], [MASK], #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \numpix == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\mask\().s}[0], [MASK], #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\mask\().h}[0], [MASK], #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\mask\().b}[0], [MASK], #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .error bilinear_load_mask_8 numpix is unsupported
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .error bilinear_load_mask_8 \numpix is unsupported
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- prfm PREFETCH_MODE, [MASK, #prefetch_offset]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ prfum PREFETCH_MODE, [MASK, #(prefetch_offset)]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_load_mask mask_fmt, numpix, mask
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_mask_&mask_fmt numpix, mask
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_mask_\mask_fmt \numpix, \mask
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -256,30 +256,30 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_load_dst_8888 numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if numpix == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&dst0&.2s, &dst1&.2s}, [OUT]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&dst0&.2s}, [OUT]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&dst0&.s}[0], [OUT]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \numpix == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\dst0\().2s, \()\dst1\().2s}, [OUT]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\dst0\().2s}, [OUT]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\dst0\().s}[0], [OUT]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .error bilinear_load_dst_8888 numpix is unsupported
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .error bilinear_load_dst_8888 \numpix is unsupported
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &dst01&.d[0], &dst0&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &dst01&.d[1], &dst1&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\dst01\().d[0], \()\dst0\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\dst01\().d[1], \()\dst1\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- prfm PREFETCH_MODE, [OUT, #(prefetch_offset * 4)]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_load_dst_8888_over numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_dst_8888 numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_dst_8888 \numpix, \dst0, \dst1, \dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_load_dst_8888_add numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_dst_8888 numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_dst_8888 \numpix, \dst0, \dst1, \dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_load_dst dst_fmt, op, numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_dst_&dst_fmt&_&op numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_dst_\()\dst_fmt\()_\()\op \numpix, \dst0, \dst1, \dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /*
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -298,19 +298,19 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_duplicate_mask_8 numpix, mask
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if numpix == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- dup &mask&.2s, &mask&.s[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- dup &mask&.4h, &mask&.h[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- dup &mask&.8b, &mask&.b[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \numpix == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ dup \()\mask\().2s, \()\mask\().s[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ dup \()\mask\().4h, \()\mask\().h[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ dup \()\mask\().8b, \()\mask\().b[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .error bilinear_duplicate_mask_8 is unsupported
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .error bilinear_duplicate_\mask_8 is unsupported
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_duplicate_mask mask_fmt, numpix, mask
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_duplicate_mask_&mask_fmt numpix, mask
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_duplicate_mask_\()\mask_fmt \numpix, \mask
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /*
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -318,14 +318,14 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- * Interleave should be done when maks is enabled or operator is 'over'.
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interleave src0, src1, src01, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vuzp &src0&.8b, &src1&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vuzp &dst0&.8b, &dst1&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vuzp &src0&.8b, &src1&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vuzp &dst0&.8b, &dst1&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &src01&.d[1], &src1&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &src01&.d[0], &src0&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &dst01&.d[1], &dst1&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &dst01&.d[0], &dst0&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vuzp \()\src0\().8b, \()\src1\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vuzp \()\dst0\().8b, \()\dst1\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vuzp \()\src0\().8b, \()\src1\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vuzp \()\dst0\().8b, \()\dst1\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\src01\().d[1], \()\src1\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\src01\().d[0], \()\src0\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\dst01\().d[1], \()\dst1\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\dst01\().d[0], \()\dst0\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interleave_src_dst_x_src \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -335,37 +335,38 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interleave_src_dst_x_over \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- numpix, src0, src1, src01, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interleave src0, src1, src01, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interleave \src0, \src1, \src01, \dst0, \dst1, \dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interleave_src_dst_x_add \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- numpix, src0, src1, src01, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interleave src0, src1, src01, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interleave \src0, \src1, \src01, \dst0, \dst1, \dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interleave_src_dst_8_src \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- numpix, src0, src1, src01, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interleave src0, src1, src01, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interleave \src0, \src1, \src01, \dst0, \dst1, \dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interleave_src_dst_8_over \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- numpix, src0, src1, src01, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interleave src0, src1, src01, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interleave \src0, \src1, \src01, \dst0, \dst1, \dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interleave_src_dst_8_add \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- numpix, src0, src1, src01, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interleave src0, src1, src01, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interleave \src0, \src1, \src01, \dst0, \dst1, \dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interleave_src_dst \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mask_fmt, op, numpix, src0, src1, src01, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interleave_src_dst_&mask_fmt&_&op \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- numpix, src0, src1, src01, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interleave_src_dst_\()\mask_fmt\()_\()\op \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \numpix, \src0, \src1, \src01, \dst0, \dst1, \dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -383,25 +384,25 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- numpix, src0, src1, src01, mask, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- tmp01, tmp23, tmp45, tmp67
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &tmp01&.8h, &src0&.8b, &mask&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &tmp23&.8h, &src1&.8b, &mask&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\tmp01\().8h, \()\src0\().8b, \()\mask\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\tmp23\().8h, \()\src1\().8b, \()\mask\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* bubbles */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- urshr &tmp45&.8h, &tmp01&.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- urshr &tmp67&.8h, &tmp23&.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ urshr \()\tmp45\().8h, \()\tmp01\().8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ urshr \()\tmp67\().8h, \()\tmp23\().8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* bubbles */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- raddhn &src0&.8b, &tmp45&.8h, &tmp01&.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- raddhn &src1&.8b, &tmp67&.8h, &tmp23&.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &src01&.d[0], &src0&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &src01&.d[1], &src1&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ raddhn \()\src0\().8b, \()\tmp45\().8h, \()\tmp01\().8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ raddhn \()\src1\().8b, \()\tmp67\().8h, \()\tmp23\().8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\src01\().d[0], \()\src0\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\src01\().d[1], \()\src1\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_apply_mask_to_src \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mask_fmt, numpix, src0, src1, src01, mask, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- tmp01, tmp23, tmp45, tmp67
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_apply_mask_to_src_&mask_fmt \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- numpix, src0, src1, src01, mask, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tmp01, tmp23, tmp45, tmp67
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_apply_mask_to_src_\()\mask_fmt \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \numpix, \src0, \src1, \src01, \mask, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \tmp01, \tmp23, \tmp45, \tmp67
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -418,90 +419,90 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- numpix, src0, src1, src01, dst0, dst1, dst01, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- tmp01, tmp23, tmp45, tmp67, tmp8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- dup &tmp8&.2s, &src1&.s[1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ dup \()\tmp8\().2s, \()\src1\().s[1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* bubbles */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mvn &tmp8&.8b, &tmp8&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mvn \()\tmp8\().8b, \()\tmp8\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* bubbles */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &tmp01&.8h, &dst0&.8b, &tmp8&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\tmp01\().8h, \()\dst0\().8b, \()\tmp8\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* bubbles */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &tmp23&.8h, &dst1&.8b, &tmp8&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\tmp23\().8h, \()\dst1\().8b, \()\tmp8\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* bubbles */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- urshr &tmp45&.8h, &tmp01&.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- urshr &tmp67&.8h, &tmp23&.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ urshr \()\tmp45\().8h, \()\tmp01\().8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ urshr \()\tmp67\().8h, \()\tmp23\().8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* bubbles */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- raddhn &dst0&.8b, &tmp45&.8h, &tmp01&.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- raddhn &dst1&.8b, &tmp67&.8h, &tmp23&.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &dst01&.d[0], &dst0&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &dst01&.d[1], &dst1&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ raddhn \()\dst0\().8b, \()\tmp45\().8h, \()\tmp01\().8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ raddhn \()\dst1\().8b, \()\tmp67\().8h, \()\tmp23\().8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\dst01\().d[0], \()\dst0\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\dst01\().d[1], \()\dst1\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* bubbles */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- uqadd &src0&.8b, &dst0&.8b, &src0&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- uqadd &src1&.8b, &dst1&.8b, &src1&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &src01&.d[0], &src0&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &src01&.d[1], &src1&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ uqadd \()\src0\().8b, \()\dst0\().8b, \()\src0\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ uqadd \()\src1\().8b, \()\dst1\().8b, \()\src1\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\src01\().d[0], \()\src0\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\src01\().d[1], \()\src1\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_combine_add \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- numpix, src0, src1, src01, dst0, dst1, dst01, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- tmp01, tmp23, tmp45, tmp67, tmp8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- uqadd &src0&.8b, &dst0&.8b, &src0&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- uqadd &src1&.8b, &dst1&.8b, &src1&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &src01&.d[0], &src0&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &src01&.d[1], &src1&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ uqadd \()\src0\().8b, \()\dst0\().8b, \()\src0\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ uqadd \()\src1\().8b, \()\dst1\().8b, \()\src1\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\src01\().d[0], \()\src0\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\src01\().d[1], \()\src1\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_combine \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- op, numpix, src0, src1, src01, dst0, dst1, dst01, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- tmp01, tmp23, tmp45, tmp67, tmp8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_combine_&op \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- numpix, src0, src1, src01, dst0, dst1, dst01, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tmp01, tmp23, tmp45, tmp67, tmp8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_combine_\()\op \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \numpix, \src0, \src1, \src01, \dst0, \dst1, \dst01, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \tmp01, \tmp23, \tmp45, \tmp67, \tmp8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /*
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- * Macros for final deinterleaving of destination pixels if needed.
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_deinterleave numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vuzp &dst0&.8b, &dst1&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vuzp \()\dst0\().8b, \()\dst1\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* bubbles */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vuzp &dst0&.8b, &dst1&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &dst01&.d[0], &dst0&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &dst01&.d[1], &dst1&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vuzp \()\dst0\().8b, \()\dst1\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\dst01\().d[0], \()\dst0\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\dst01\().d[1], \()\dst1\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_deinterleave_dst_x_src numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_deinterleave_dst_x_over numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_deinterleave numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_deinterleave \numpix, \dst0, \dst1, \dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_deinterleave_dst_x_add numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_deinterleave numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_deinterleave \numpix, \dst0, \dst1, \dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_deinterleave_dst_8_src numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_deinterleave numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_deinterleave \numpix, \dst0, \dst1, \dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_deinterleave_dst_8_over numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_deinterleave numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_deinterleave \numpix, \dst0, \dst1, \dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_deinterleave_dst_8_add numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_deinterleave numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_deinterleave \numpix, \dst0, \dst1, \dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_deinterleave_dst mask_fmt, op, numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_deinterleave_dst_&mask_fmt&_&op numpix, dst0, dst1, dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_deinterleave_dst_\()\mask_fmt\()_\()\op \numpix, \dst0, \dst1, \dst01
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interpolate_last_pixel src_fmt, mask_fmt, dst_fmt, op
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_&src_fmt v0, v1, v2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_mask mask_fmt, 1, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_dst dst_fmt, op, 1, v18, v19, v9
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_\()\src_fmt v0, v1, v2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_mask \mask_fmt, 1, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_dst \dst_fmt, \op, 1, v18, v19, v9
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v2.8h, v0.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umlal v2.8h, v1.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* 5 cycles bubble */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -509,28 +510,28 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umlsl v0.4s, v2.4h, v15.h[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umlal2 v0.4s, v2.8h, v15.h[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* 5 cycles bubble */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_duplicate_mask mask_fmt, 1, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_duplicate_mask \mask_fmt, 1, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- shrn v0.4h, v0.4s, #(2 * BILINEAR_INTERPOLATION_BITS)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* 3 cycles bubble */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- xtn v0.8b, v0.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* 1 cycle bubble */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bilinear_interleave_src_dst \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mask_fmt, op, 1, v0, v1, v0, v18, v19, v9
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \mask_fmt, \op, 1, v0, v1, v0, v18, v19, v9
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bilinear_apply_mask_to_src \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mask_fmt, 1, v0, v1, v0, v4, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \mask_fmt, 1, v0, v1, v0, v4, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- v3, v8, v10, v11
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bilinear_combine \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- op, 1, v0, v1, v0, v18, v19, v9, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \op, 1, v0, v1, v0, v18, v19, v9, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- v3, v8, v10, v11, v5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_deinterleave_dst mask_fmt, op, 1, v0, v1, v0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_store_&dst_fmt 1, v17, v18
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_deinterleave_dst \mask_fmt, \op, 1, v0, v1, v0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_store_\()\dst_fmt 1, v17, v18
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interpolate_two_pixels src_fmt, mask_fmt, dst_fmt, op
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_and_vertical_interpolate_two_&src_fmt \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_and_vertical_interpolate_two_\()\src_fmt \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- v1, v11, v18, v19, v20, v21, v22, v23
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_mask mask_fmt, 2, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_dst dst_fmt, op, 2, v18, v19, v9
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_mask \mask_fmt, 2, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_dst \dst_fmt, \op, 2, v18, v19, v9
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ushll v0.4s, v1.4h, #BILINEAR_INTERPOLATION_BITS
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umlsl v0.4s, v1.4h, v15.h[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umlal2 v0.4s, v1.8h, v15.h[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -539,25 +540,25 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umlal2 v10.4s, v11.8h, v15.h[4]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- shrn v0.4h, v0.4s, #(2 * BILINEAR_INTERPOLATION_BITS)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- shrn2 v0.8h, v10.4s, #(2 * BILINEAR_INTERPOLATION_BITS)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_duplicate_mask mask_fmt, 2, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_duplicate_mask \mask_fmt, 2, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ushr v15.8h, v12.8h, #(16 - BILINEAR_INTERPOLATION_BITS)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add v12.8h, v12.8h, v13.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- xtn v0.8b, v0.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bilinear_interleave_src_dst \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mask_fmt, op, 2, v0, v1, v0, v18, v19, v9
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \mask_fmt, \op, 2, v0, v1, v0, v18, v19, v9
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bilinear_apply_mask_to_src \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mask_fmt, 2, v0, v1, v0, v4, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \mask_fmt, 2, v0, v1, v0, v4, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- v3, v8, v10, v11
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bilinear_combine \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- op, 2, v0, v1, v0, v18, v19, v9, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \op, 2, v0, v1, v0, v18, v19, v9, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- v3, v8, v10, v11, v5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_deinterleave_dst mask_fmt, op, 2, v0, v1, v0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_store_&dst_fmt 2, v16, v17
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_deinterleave_dst \mask_fmt, \op, 2, v0, v1, v0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_store_\()\dst_fmt 2, v16, v17
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interpolate_four_pixels src_fmt, mask_fmt, dst_fmt, op
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_and_vertical_interpolate_four_&src_fmt \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- v1, v11, v4, v5, v6, v7, v22, v23 \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_and_vertical_interpolate_four_\()\src_fmt \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ v1, v11, v4, v5, v6, v7, v22, v23, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- v3, v9, v16, v17, v20, v21, v18, v19
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- prfm PREFETCH_MODE, [TMP1, PF_OFFS]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub TMP1, TMP1, STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -580,23 +581,23 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- shrn2 v0.8h, v10.4s, #(2 * BILINEAR_INTERPOLATION_BITS)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- shrn v2.4h, v2.4s, #(2 * BILINEAR_INTERPOLATION_BITS)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- shrn2 v2.8h, v8.4s, #(2 * BILINEAR_INTERPOLATION_BITS)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_mask mask_fmt, 4, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_duplicate_mask mask_fmt, 4, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_mask \mask_fmt, 4, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_duplicate_mask \mask_fmt, 4, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ushr v15.8h, v12.8h, #(16 - BILINEAR_INTERPOLATION_BITS)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- xtn v0.8b, v0.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- xtn v1.8b, v2.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add v12.8h, v12.8h, v13.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_dst dst_fmt, op, 4, v2, v3, v21
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_dst \dst_fmt, \op, 4, v2, v3, v21
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bilinear_interleave_src_dst \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mask_fmt, op, 4, v0, v1, v0, v2, v3, v11
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \mask_fmt, \op, 4, v0, v1, v0, v2, v3, v11
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bilinear_apply_mask_to_src \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mask_fmt, 4, v0, v1, v0, v4, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \mask_fmt, 4, v0, v1, v0, v4, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- v6, v8, v9, v10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bilinear_combine \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- op, 4, v0, v1, v0, v2, v3, v1, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \op, 4, v0, v1, v0, v2, v3, v1, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- v6, v8, v9, v10, v23
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_deinterleave_dst mask_fmt, op, 4, v0, v1, v0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_store_&dst_fmt 4, v6, v7
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_deinterleave_dst \mask_fmt, \op, 4, v0, v1, v0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_store_\()\dst_fmt 4, v6, v7
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .set BILINEAR_FLAG_USE_MASK, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -636,14 +637,14 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- prefetch_distance, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- flags
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--pixman_asm_function fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if pixblock_size == 8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif pixblock_size == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+pixman_asm_function \fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \pixblock_size == 8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \pixblock_size == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .error unsupported pixblock size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if ((flags) & BILINEAR_FLAG_USE_MASK) == 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if ((\flags) & BILINEAR_FLAG_USE_MASK) == 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- OUT .req x0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- TOP .req x1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- BOTTOM .req x2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -699,7 +700,7 @@ pixman_asm_function fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- STRIDE .req x15
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- DUMMY .req x30
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set prefetch_offset, prefetch_distance
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set prefetch_offset, \prefetch_distance
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- stp x29, x30, [sp, -16]!
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov x29, sp
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -714,7 +715,7 @@ pixman_asm_function fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub sp, sp, 120
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov WTMP1, #prefetch_distance
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov WTMP1, #\prefetch_distance
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull PF_OFFS, WTMP1, UX
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub STRIDE, BOTTOM, TOP
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -735,11 +736,11 @@ pixman_asm_function fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* ensure good destination alignment */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- cmp WIDTH, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 100f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tst OUT, #(1 << dst_bpp_shift)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ tst OUT, #(1 << \dst_bpp_shift)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 100f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ushr v15.8h, v12.8h, #(16 - BILINEAR_INTERPOLATION_BITS)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add v12.8h, v12.8h, v13.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_process_last_pixel
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \bilinear_process_last_pixel
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub WIDTH, WIDTH, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 100:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add v13.8h, v13.8h, v13.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -748,50 +749,50 @@ pixman_asm_function fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- cmp WIDTH, #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 100f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tst OUT, #(1 << (dst_bpp_shift + 1))
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ tst OUT, #(1 << (\dst_bpp_shift + 1))
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 100f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_process_two_pixels
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \bilinear_process_two_pixels
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub WIDTH, WIDTH, #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 100:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if pixblock_size == 8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \pixblock_size == 8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- cmp WIDTH, #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 100f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tst OUT, #(1 << (dst_bpp_shift + 2))
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ tst OUT, #(1 << (\dst_bpp_shift + 2))
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 100f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_process_four_pixels
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \bilinear_process_four_pixels
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub WIDTH, WIDTH, #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 100:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- subs WIDTH, WIDTH, #pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ subs WIDTH, WIDTH, #\pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 100f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- asr PF_OFFS, PF_OFFS, #(16 - src_bpp_shift)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_process_pixblock_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- subs WIDTH, WIDTH, #pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ asr PF_OFFS, PF_OFFS, #(16 - \src_bpp_shift)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \bilinear_process_pixblock_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ subs WIDTH, WIDTH, #\pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 500f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 0:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- subs WIDTH, WIDTH, #pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \bilinear_process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ subs WIDTH, WIDTH, #\pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bge 0b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 500:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_process_pixblock_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \bilinear_process_pixblock_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 100:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if pixblock_size == 8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \pixblock_size == 8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- tst WIDTH, #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 200f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_process_four_pixels
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \bilinear_process_four_pixels
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 200:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* handle the remaining trailing pixels */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- tst WIDTH, #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 200f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_process_two_pixels
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \bilinear_process_two_pixels
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 200:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- tst WIDTH, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 300f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_process_last_pixel
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \bilinear_process_last_pixel
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 300:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if ((flags) & BILINEAR_FLAG_USE_MASK) == 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if ((\flags) & BILINEAR_FLAG_USE_MASK) == 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub x29, x29, 64
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ld1 {v8.8b, v9.8b, v10.8b, v11.8b}, [x29], 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ld1 {v12.8b, v13.8b, v14.8b, v15.8b}, [x29], 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -829,11 +830,11 @@ pixman_asm_function fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .unreq TMP3
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .unreq TMP4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .unreq STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if ((flags) & BILINEAR_FLAG_USE_MASK) != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if ((\flags) & BILINEAR_FLAG_USE_MASK) != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .unreq MASK
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.endfunc
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+pixman_end_asm_function
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>---- pixman/pixman-arma64-neon-asm.S
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+++ pixman/pixman-arma64-neon-asm.S
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -267,54 +267,54 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sli v4.8h, v4.8h, #5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ushll v14.8h, v17.8b, #7
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sli v14.8h, v14.8h, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ushll v8.8h, v19.8b, #7
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sli v8.8h, v8.8h, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF tst PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF tst, PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sri v6.8b, v6.8b, #5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mvn v3.8b, v3.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sri v7.8b, v7.8b, #6
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- shrn v30.8b, v4.8h, #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v10.8h, v3.8b, v6.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v11.8h, v3.8b, v7.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v12.8h, v3.8b, v30.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sri v14.8h, v8.8h, #5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF cmp PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF cmp, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ushll v9.8h, v18.8b, #7
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sli v9.8h, v9.8h, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v17.8h, v10.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v19.8h, v11.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v18.8h, v12.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF subs PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF subs, PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sri v14.8h, v9.8h, #11
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov v28.d[0], v14.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov v29.d[0], v14.d[1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v20.8b, v10.8h, v17.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v23.8b, v11.8h, v19.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_DST, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_DST, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v22.8b, v12.8h, v18.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v14.8h}, [DST_W], #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -474,32 +474,32 @@ generate_composite_function \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixman_composite_src_8888_0565_process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sri v14.8h, v8.8h, #5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF tst PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF tst, PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- fetch_src_pixblock
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sri v14.8h, v9.8h, #11
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov v28.d[0], v14.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov v29.d[0], v14.d[1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF cmp PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF cmp, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ushll v8.8h, v1.8b, #7
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sli v8.8h, v8.8h, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v14.8h}, [DST_W], #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF subs PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF subs, PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ushll v14.8h, v2.8b, #7
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sli v14.8h, v14.8h, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ushll v9.8h, v0.8b, #7
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sli v9.8h, v9.8h, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -566,31 +566,31 @@ generate_composite_function \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixman_composite_add_8_8_process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- fetch_src_pixblock
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF tst PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF tst, PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ld1 {v4.8b, v5.8b, v6.8b, v7.8b}, [DST_R], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v28.8b, v29.8b, v30.8b, v31.8b}, [DST_W], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF cmp PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF subs PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF cmp, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF subs, PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v28.8b, v0.8b, v4.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v29.8b, v1.8b, v5.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v30.8b, v2.8b, v6.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -612,31 +612,31 @@ generate_composite_function \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixman_composite_add_8888_8888_process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- fetch_src_pixblock
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF tst PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF tst, PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ld1 {v4.8b, v5.8b, v6.8b, v7.8b}, [DST_R], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v28.8b, v29.8b, v30.8b, v31.8b}, [DST_W], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF cmp PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF subs PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF cmp, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF subs, PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v28.8b, v0.8b, v4.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v29.8b, v1.8b, v5.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v30.8b, v2.8b, v6.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -689,45 +689,45 @@ generate_composite_function_single_scanline \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixman_composite_out_reverse_8888_8888_process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ld4 {v4.8b, v5.8b, v6.8b, v7.8b}, [DST_R], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v14.8h, v8.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF tst PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF tst, PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v15.8h, v9.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v16.8h, v10.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v17.8h, v11.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v28.8b, v14.8h, v8.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v29.8b, v15.8h, v9.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF cmp PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF cmp, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v30.8b, v16.8h, v10.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v31.8b, v17.8h, v11.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- fetch_src_pixblock
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mvn v22.8b, v3.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st4 {v28.8b, v29.8b, v30.8b, v31.8b}, [DST_W], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v8.8h, v22.8b, v4.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF subs PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF subs, PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v9.8h, v22.8b, v5.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v10.8h, v22.8b, v6.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v11.8h, v22.8b, v7.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -759,18 +759,18 @@ generate_composite_function_single_scanline \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixman_composite_over_8888_8888_process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ld4 {v4.8b, v5.8b, v6.8b, v7.8b}, [DST_R], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v14.8h, v8.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF tst PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF tst, PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v15.8h, v9.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v16.8h, v10.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v17.8h, v11.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v28.8b, v14.8h, v8.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v29.8b, v15.8h, v9.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF cmp PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF cmp, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v30.8b, v16.8h, v10.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v31.8b, v17.8h, v11.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v28.8b, v0.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -778,30 +778,30 @@ generate_composite_function_single_scanline \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v30.8b, v2.8b, v30.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v31.8b, v3.8b, v31.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- fetch_src_pixblock
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mvn v22.8b, v3.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st4 {v28.8b, v29.8b, v30.8b, v31.8b}, [DST_W], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v8.8h, v22.8b, v4.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF subs PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF subs, PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v9.8h, v22.8b, v5.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v10.8h, v22.8b, v6.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v11.8h, v22.8b, v7.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -865,30 +865,30 @@ generate_composite_function_single_scanline \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v31.8b, v17.8h, v11.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ld4 {v4.8b, v5.8b, v6.8b, v7.8b}, [DST_R], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v28.8b, v0.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF tst PF_CTL, #0x0F
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF tst, PF_CTL, #0x0F
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v29.8b, v1.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v30.8b, v2.8b, v30.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v31.8b, v3.8b, v31.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF cmp PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF cmp, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v8.8h, v24.8b, v4.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v9.8h, v24.8b, v5.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v10.8h, v24.8b, v6.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF subs PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF subs, PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v11.8h, v24.8b, v7.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st4 {v28.8b, v29.8b, v30.8b, v31.8b}, [DST_W], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -917,18 +917,18 @@ generate_composite_function \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixman_composite_over_reverse_n_8888_process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v14.8h, v8.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF tst PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF tst, PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v15.8h, v9.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v12.8h, v10.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v13.8h, v11.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v28.8b, v14.8h, v8.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v29.8b, v15.8h, v9.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF cmp PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF cmp, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v30.8b, v12.8h, v10.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v31.8b, v13.8h, v11.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v28.8b, v0.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -937,22 +937,22 @@ generate_composite_function \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v31.8b, v3.8b, v31.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ld4 {v0.8b, v1.8b, v2.8b, v3.8b}, [DST_R], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mvn v22.8b, v3.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st4 {v28.8b, v29.8b, v30.8b, v31.8b}, [DST_W], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF blt 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF blt, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v8.8h, v22.8b, v4.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF blt 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF subs PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF blt, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF subs, PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v9.8h, v22.8b, v5.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v10.8h, v22.8b, v6.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF blt 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF blt, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v11.8h, v22.8b, v7.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -1410,35 +1410,35 @@ generate_composite_function \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixman_composite_src_n_8_8888_process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- fetch_mask_pixblock
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- rshrn v28.8b, v8.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF tst PF_CTL, #0x0F
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF tst, PF_CTL, #0x0F
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- rshrn v29.8b, v9.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- rshrn v30.8b, v10.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- rshrn v31.8b, v11.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF cmp PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF cmp, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v8.8h, v24.8b, v0.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v9.8h, v24.8b, v1.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v10.8h, v24.8b, v2.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF subs PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF subs, PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v11.8h, v24.8b, v3.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, MASK_STRIDE, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_MASK, PF_MASK, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, MASK_STRIDE, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_MASK, PF_MASK, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st4 {v28.8b, v29.8b, v30.8b, v31.8b}, [DST_W], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ursra v8.8h, v8.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -1491,35 +1491,35 @@ generate_composite_function \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixman_composite_src_n_8_8_process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- fetch_mask_pixblock
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- rshrn v28.8b, v0.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF tst PF_CTL, #0x0F
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF tst, PF_CTL, #0x0F
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- rshrn v29.8b, v1.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- rshrn v30.8b, v2.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- rshrn v31.8b, v3.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF cmp PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF cmp, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v0.8h, v24.8b, v16.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v1.8h, v25.8b, v16.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v2.8h, v26.8b, v16.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF subs PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF subs, PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v3.8h, v27.8b, v16.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, MASK_STRIDE, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_MASK, PF_MASK, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, MASK_STRIDE, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_MASK, PF_MASK, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v28.8b, v29.8b, v30.8b, v31.8b}, [DST_W], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ursra v0.8h, v0.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -1599,44 +1599,44 @@ generate_composite_function \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v17.8h, v13.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- fetch_mask_pixblock
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v18.8h, v14.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v19.8h, v15.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF tst PF_CTL, #0x0F
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF tst, PF_CTL, #0x0F
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v28.8b, v16.8h, v12.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v29.8b, v17.8h, v13.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v30.8b, v18.8h, v14.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF cmp PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF cmp, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v31.8b, v19.8h, v15.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v16.8h, v24.8b, v8.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v17.8h, v24.8b, v9.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v18.8h, v24.8b, v10.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF subs PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF subs, PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v19.8h, v24.8b, v11.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v28.8b, v0.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, MASK_STRIDE, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_MASK, PF_MASK, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, MASK_STRIDE, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_MASK, PF_MASK, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v29.8b, v1.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- uqadd v30.8b, v2.8b, v30.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -2412,7 +2412,7 @@ generate_composite_function_single_scanline \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- default_cleanup_need_all_regs, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixman_composite_out_reverse_8888_n_8888_process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixman_composite_out_reverse_8888_n_8888_process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixman_composite_out_reverse_8888_8888_8888_process_pixblock_tail_head \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixman_composite_out_reverse_8888_8888_8888_process_pixblock_tail_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 28, /* dst_w_basereg */ \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 4, /* dst_r_basereg */ \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 0, /* src_basereg */ \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -2487,7 +2487,7 @@ generate_composite_function \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- default_cleanup_need_all_regs, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixman_composite_over_8888_n_8888_process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixman_composite_over_8888_n_8888_process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixman_composite_over_8888_8888_8888_process_pixblock_tail_head \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixman_composite_over_8888_8888_8888_process_pixblock_tail_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 28, /* dst_w_basereg */ \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 4, /* dst_r_basereg */ \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 0, /* src_basereg */ \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -2501,7 +2501,7 @@ generate_composite_function_single_scanline \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- default_cleanup_need_all_regs, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixman_composite_over_8888_n_8888_process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixman_composite_over_8888_n_8888_process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixman_composite_over_8888_8888_8888_process_pixblock_tail_head \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixman_composite_over_8888_8888_8888_process_pixblock_tail_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 28, /* dst_w_basereg */ \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 4, /* dst_r_basereg */ \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 0, /* src_basereg */ \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -2529,7 +2529,7 @@ generate_composite_function \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- default_cleanup_need_all_regs, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixman_composite_over_8888_n_8888_process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixman_composite_over_8888_n_8888_process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixman_composite_over_8888_8_8888_process_pixblock_tail_head \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixman_composite_over_8888_8_8888_process_pixblock_tail_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 28, /* dst_w_basereg */ \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 4, /* dst_r_basereg */ \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 0, /* src_basereg */ \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -2680,11 +2680,11 @@ generate_composite_function \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v13.8h, v10.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- fetch_src_pixblock
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v30.8b, v11.8h, v8.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF tst PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF tst, PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v29.8b, v12.8h, v9.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v28.8b, v13.8h, v10.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -2692,16 +2692,16 @@ generate_composite_function \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v9.8h, v3.8b, v1.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v10.8h, v3.8b, v2.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st4 {v28.8b, v29.8b, v30.8b, v31.8b}, [DST_W], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF cmp PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF subs PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF cmp, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF subs, PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -2749,11 +2749,11 @@ generate_composite_function \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- urshr v13.8h, v10.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- fetch_src_pixblock
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v28.8b, v11.8h, v8.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF tst PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF tst, PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v29.8b, v12.8h, v9.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- raddhn v30.8b, v13.8h, v10.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -2761,16 +2761,16 @@ generate_composite_function \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v9.8h, v3.8b, v1.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v10.8h, v3.8b, v2.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st4 {v28.8b, v29.8b, v30.8b, v31.8b}, [DST_W], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF cmp PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF subs PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF cmp, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF subs, PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 10f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 10:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -3131,53 +3131,53 @@ generate_composite_function_nearest_scanline \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr TMP1, X, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add X, X, UX
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add TMP1, TOP, TMP1, lsl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&reg1&.2s}, [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&reg2&.2s}, [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\reg1\().2s}, [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\reg2\().2s}, [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_load_0565 reg1, reg2, tmp
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr TMP1, X, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add X, X, UX
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add TMP1, TOP, TMP1, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&reg2&.s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&reg2&.s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- convert_four_0565_to_x888_packed reg2, reg1, reg2, tmp
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\reg2\().s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\reg2\().s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ convert_four_0565_to_x888_packed \reg2, \reg1, \reg2, \tmp
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_load_and_vertical_interpolate_two_8888 \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- acc1, acc2, reg1, reg2, reg3, reg4, tmp1, tmp2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_8888 reg1, reg2, tmp1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &acc1&.8h, &reg1&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &acc1&.8h, &reg2&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_8888 reg3, reg4, tmp2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &acc2&.8h, &reg3&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &acc2&.8h, &reg4&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_8888 \reg1, \reg2, \tmp1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\acc1\().8h, \()\reg1\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\acc1\().8h, \()\reg2\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_8888 \reg3, \reg4, \tmp2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\acc2\().8h, \()\reg3\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\acc2\().8h, \()\reg4\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_load_and_vertical_interpolate_four_8888 \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- xacc1, xacc2, xreg1, xreg2, xreg3, xreg4, xacc2lo, xacc2hi \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ xacc1, xacc2, xreg1, xreg2, xreg3, xreg4, xacc2lo, xacc2hi, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- yacc1, yacc2, yreg1, yreg2, yreg3, yreg4, yacc2lo, yacc2hi
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bilinear_load_and_vertical_interpolate_two_8888 \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- xacc1, xacc2, xreg1, xreg2, xreg3, xreg4, xacc2lo, xacc2hi
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \xacc1, \xacc2, \xreg1, \xreg2, \xreg3, \xreg4, \xacc2lo, \xacc2hi
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bilinear_load_and_vertical_interpolate_two_8888 \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- yacc1, yacc2, yreg1, yreg2, yreg3, yreg4, yacc2lo, yacc2hi
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \yacc1, \yacc2, \yreg1, \yreg2, \yreg3, \yreg4, \yacc2lo, \yacc2hi
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro vzip reg1, reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umov TMP4, v31.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- zip1 v31.8b, reg1, reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- zip2 reg2, reg1, reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov reg1, v31.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ zip1 v31.8b, \reg1, \reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ zip2 \reg2, \reg1, \reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \reg1, v31.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov v31.d[0], TMP4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro vuzp reg1, reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umov TMP4, v31.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- uzp1 v31.8b, reg1, reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- uzp2 reg2, reg1, reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov reg1, v31.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ uzp1 v31.8b, \reg1, \reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ uzp2 \reg2, \reg1, \reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \reg1, v31.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov v31.d[0], TMP4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -3189,23 +3189,23 @@ generate_composite_function_nearest_scanline \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr TMP2, X, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add X, X, UX
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add TMP2, TOP, TMP2, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&acc2&.s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&acc2&.s}[2], [TMP2], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&acc2&.s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&acc2&.s}[3], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- convert_0565_to_x888 acc2, reg3, reg2, reg1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &reg1&.8b, &reg3&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &reg2&.8b, &reg4&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &reg3&.8b, &reg4&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &reg1&.8b, &reg2&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &acc1&.8h, &reg1&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &acc1&.8h, &reg2&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &acc2&.8h, &reg3&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &acc2&.8h, &reg4&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\acc2\().s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\acc2\().s}[2], [TMP2], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\acc2\().s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\acc2\().s}[3], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ convert_0565_to_x888 \acc2, \reg3, \reg2, \reg1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\reg1\().8b, \()\reg3\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\reg2\().8b, \()\reg4\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\reg3\().8b, \()\reg4\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\reg1\().8b, \()\reg2\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\acc1\().8h, \()\reg1\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\acc1\().8h, \()\reg2\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\acc2\().8h, \()\reg3\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\acc2\().8h, \()\reg4\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_load_and_vertical_interpolate_four_0565 \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- xacc1, xacc2, xreg1, xreg2, xreg3, xreg4, xacc2lo, xacc2hi \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ xacc1, xacc2, xreg1, xreg2, xreg3, xreg4, xacc2lo, xacc2hi, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- yacc1, yacc2, yreg1, yreg2, yreg3, yreg4, yacc2lo, yacc2hi
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr TMP1, X, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add X, X, UX
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -3213,49 +3213,49 @@ generate_composite_function_nearest_scanline \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr TMP2, X, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add X, X, UX
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add TMP2, TOP, TMP2, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&xacc2&.s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&xacc2&.s}[2], [TMP2], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&xacc2&.s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&xacc2&.s}[3], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- convert_0565_to_x888 xacc2, xreg3, xreg2, xreg1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\xacc2\().s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\xacc2\().s}[2], [TMP2], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\xacc2\().s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\xacc2\().s}[3], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ convert_0565_to_x888 \xacc2, \xreg3, \xreg2, \xreg1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr TMP1, X, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add X, X, UX
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add TMP1, TOP, TMP1, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr TMP2, X, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add X, X, UX
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add TMP2, TOP, TMP2, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&yacc2&.s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &xreg1&.8b, &xreg3&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&yacc2&.s}[2], [TMP2], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &xreg2&.8b, &xreg4&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&yacc2&.s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &xreg3&.8b, &xreg4&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {&yacc2&.s}[3], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &xreg1&.8b, &xreg2&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- convert_0565_to_x888 yacc2, yreg3, yreg2, yreg1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &xacc1&.8h, &xreg1&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &yreg1&.8b, &yreg3&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &xacc1&.8h, &xreg2&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &yreg2&.8b, &yreg4&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &xacc2&.8h, &xreg3&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &yreg3&.8b, &yreg4&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &xacc2&.8h, &xreg4&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip &yreg1&.8b, &yreg2&.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &yacc1&.8h, &yreg1&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &yacc1&.8h, &yreg2&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umull &yacc2&.8h, &yreg3&.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- umlal &yacc2&.8h, &yreg4&.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\yacc2\().s}[0], [TMP1], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\xreg1\().8b, \()\xreg3\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\yacc2\().s}[2], [TMP2], STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\xreg2\().8b, \()\xreg4\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\yacc2\().s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\xreg3\().8b, \()\xreg4\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {\()\yacc2\().s}[3], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\xreg1\().8b, \()\xreg2\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ convert_0565_to_x888 \yacc2, \yreg3, \yreg2, \yreg1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\xacc1\().8h, \()\xreg1\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\yreg1\().8b, \()\yreg3\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\xacc1\().8h, \()\xreg2\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\yreg2\().8b, \()\yreg4\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\xacc2\().8h, \()\xreg3\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\yreg3\().8b, \()\yreg4\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\xacc2\().8h, \()\xreg4\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip \()\yreg1\().8b, \()\yreg2\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\yacc1\().8h, \()\yreg1\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\yacc1\().8h, \()\yreg2\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umull \()\yacc2\().8h, \()\yreg3\().8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ umlal \()\yacc2\().8h, \()\yreg4\().8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_store_8888 numpix, tmp1, tmp2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if numpix == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \numpix == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v0.2s, v1.2s}, [OUT], #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v0.2s}, [OUT], #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v0.s}[0], [OUT], #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .error bilinear_store_8888 numpix is unsupported
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .error bilinear_store_8888 \numpix is unsupported
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -3264,20 +3264,20 @@ generate_composite_function_nearest_scanline \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- vuzp v2.8b, v3.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- vuzp v1.8b, v3.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- vuzp v0.8b, v2.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- convert_8888_to_0565 v2, v1, v0, v1, tmp1, tmp2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if numpix == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ convert_8888_to_0565 v2, v1, v0, v1, \tmp1, \tmp2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \numpix == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v1.4h}, [OUT], #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v1.s}[0], [OUT], #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v1.h}[0], [OUT], #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .error bilinear_store_0565 numpix is unsupported
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .error bilinear_store_0565 \numpix is unsupported
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interpolate_last_pixel src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_&src_fmt v0, v1, v2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_\()\src_fmt v0, v1, v2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umull v2.8h, v0.8b, v28.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umlal v2.8h, v1.8b, v29.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* 5 cycles bubble */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -3289,11 +3289,11 @@ generate_composite_function_nearest_scanline \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* 3 cycles bubble */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- xtn v0.8b, v0.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* 1 cycle bubble */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_store_&dst_fmt 1, v3, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_store_\()\dst_fmt 1, v3, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interpolate_two_pixels src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_and_vertical_interpolate_two_&src_fmt \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_and_vertical_interpolate_two_\()\src_fmt \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- v1, v11, v2, v3, v20, v21, v22, v23
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ushll v0.4s, v1.4h, #BILINEAR_INTERPOLATION_BITS
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umlsl v0.4s, v1.4h, v15.h[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -3306,12 +3306,12 @@ generate_composite_function_nearest_scanline \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ushr v15.8h, v12.8h, #(16 - BILINEAR_INTERPOLATION_BITS)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add v12.8h, v12.8h, v13.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- xtn v0.8b, v0.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_store_&dst_fmt 2, v3, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_store_\()\dst_fmt 2, v3, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interpolate_four_pixels src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_load_and_vertical_interpolate_four_&src_fmt \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- v1, v11, v14, v20, v16, v17, v22, v23 \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_load_and_vertical_interpolate_four_\()\src_fmt \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ v1, v11, v14, v20, v16, v17, v22, v23, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- v3, v9, v24, v25, v26, v27, v18, v19
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- prfm PREFETCH_MODE, [TMP1, PF_OFFS]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub TMP1, TMP1, STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -3338,54 +3338,54 @@ generate_composite_function_nearest_scanline \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- xtn v0.8b, v0.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- xtn v1.8b, v2.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add v12.8h, v12.8h, v13.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_store_&dst_fmt 4, v3, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_store_\()\dst_fmt 4, v3, v4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interpolate_four_pixels_head src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.ifdef have_bilinear_interpolate_four_pixels_&src_fmt&_&dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_four_pixels_&src_fmt&_&dst_fmt&_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.ifdef have_bilinear_interpolate_four_pixels_\()\src_fmt\()_\()\dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_four_pixels_\()\src_fmt\()_\()\dst_fmt\()_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_four_pixels src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_four_pixels \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interpolate_four_pixels_tail src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.ifdef have_bilinear_interpolate_four_pixels_&src_fmt&_&dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_four_pixels_&src_fmt&_&dst_fmt&_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.ifdef have_bilinear_interpolate_four_pixels_\()\src_fmt\()_\()\dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_four_pixels_\()\src_fmt\()_\()\dst_fmt\()_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interpolate_four_pixels_tail_head src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.ifdef have_bilinear_interpolate_four_pixels_&src_fmt&_&dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_four_pixels_&src_fmt&_&dst_fmt&_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.ifdef have_bilinear_interpolate_four_pixels_\()\src_fmt\()_\()\dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_four_pixels_\()\src_fmt\()_\()\dst_fmt\()_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_four_pixels src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_four_pixels \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interpolate_eight_pixels_head src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.ifdef have_bilinear_interpolate_eight_pixels_&src_fmt&_&dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_eight_pixels_&src_fmt&_&dst_fmt&_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.ifdef have_bilinear_interpolate_eight_pixels_\()\src_fmt\()_\()\dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_eight_pixels_\()\src_fmt\()_\()\dst_fmt\()_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_four_pixels_head src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_four_pixels_tail_head src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_four_pixels_head \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_four_pixels_tail_head \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interpolate_eight_pixels_tail src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.ifdef have_bilinear_interpolate_eight_pixels_&src_fmt&_&dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_eight_pixels_&src_fmt&_&dst_fmt&_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.ifdef have_bilinear_interpolate_eight_pixels_\()\src_fmt\()_\()\dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_eight_pixels_\()\src_fmt\()_\()\dst_fmt\()_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_four_pixels_tail src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_four_pixels_tail \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro bilinear_interpolate_eight_pixels_tail_head src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.ifdef have_bilinear_interpolate_eight_pixels_&src_fmt&_&dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_eight_pixels_&src_fmt&_&dst_fmt&_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.ifdef have_bilinear_interpolate_eight_pixels_\()\src_fmt\()_\()\dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_eight_pixels_\()\src_fmt\()_\()\dst_fmt\()_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_four_pixels_tail_head src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_four_pixels_tail_head src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_four_pixels_tail_head \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_four_pixels_tail_head \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -3410,7 +3410,7 @@ generate_composite_function_nearest_scanline \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- src_bpp_shift, dst_bpp_shift, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- prefetch_distance, flags
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--pixman_asm_function fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+pixman_asm_function \fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- OUT .req x0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- TOP .req x1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- BOTTOM .req x2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -3442,7 +3442,7 @@ pixman_asm_function fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- stp x10, x11, [x29, -96]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- stp x12, x13, [x29, -112]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov PF_OFFS, #prefetch_distance
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov PF_OFFS, #\prefetch_distance
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mul PF_OFFS, PF_OFFS, UX
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- subs STRIDE, BOTTOM, TOP
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -3463,11 +3463,11 @@ pixman_asm_function fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* ensure good destination alignment */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- cmp WIDTH, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 100f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tst OUT, #(1 << dst_bpp_shift)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ tst OUT, #(1 << \dst_bpp_shift)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 100f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ushr v15.8h, v12.8h, #(16 - BILINEAR_INTERPOLATION_BITS)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add v12.8h, v12.8h, v13.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_last_pixel src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_last_pixel \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub WIDTH, WIDTH, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 100:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add v13.8h, v13.8h, v13.8h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -3476,62 +3476,62 @@ pixman_asm_function fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- cmp WIDTH, #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 100f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tst OUT, #(1 << (dst_bpp_shift + 1))
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ tst OUT, #(1 << (\dst_bpp_shift + 1))
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 100f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_two_pixels src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_two_pixels \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub WIDTH, WIDTH, #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 100:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if ((flags) & BILINEAR_FLAG_UNROLL_8) != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if ((\flags) & BILINEAR_FLAG_UNROLL_8) != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /*********** 8 pixels per iteration *****************/
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- cmp WIDTH, #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 100f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tst OUT, #(1 << (dst_bpp_shift + 2))
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ tst OUT, #(1 << (\dst_bpp_shift + 2))
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 100f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_four_pixels src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_four_pixels \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub WIDTH, WIDTH, #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 100:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- subs WIDTH, WIDTH, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 100f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- asr PF_OFFS, PF_OFFS, #(16 - src_bpp_shift)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_eight_pixels_head src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ asr PF_OFFS, PF_OFFS, #(16 - \src_bpp_shift)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_eight_pixels_head \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- subs WIDTH, WIDTH, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 500f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 1000:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_eight_pixels_tail_head src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_eight_pixels_tail_head \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- subs WIDTH, WIDTH, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bge 1000b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 500:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_eight_pixels_tail src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_eight_pixels_tail \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 100:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- tst WIDTH, #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 200f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_four_pixels src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_four_pixels \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 200:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /*********** 4 pixels per iteration *****************/
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- subs WIDTH, WIDTH, #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 100f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- asr PF_OFFS, PF_OFFS, #(16 - src_bpp_shift)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_four_pixels_head src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ asr PF_OFFS, PF_OFFS, #(16 - \src_bpp_shift)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_four_pixels_head \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- subs WIDTH, WIDTH, #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 500f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 1000:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_four_pixels_tail_head src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_four_pixels_tail_head \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- subs WIDTH, WIDTH, #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bge 1000b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 500:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_four_pixels_tail src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_four_pixels_tail \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 100:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /****************************************************/
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* handle the remaining trailing pixels */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- tst WIDTH, #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 200f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_two_pixels src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_two_pixels \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 200:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- tst WIDTH, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 300f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bilinear_interpolate_last_pixel src_fmt, dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bilinear_interpolate_last_pixel \src_fmt, \dst_fmt
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 300:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub x29, x29, 64
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ld1 {v8.8b, v9.8b, v10.8b, v11.8b}, [x29], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -3556,7 +3556,7 @@ pixman_asm_function fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .unreq TMP3
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .unreq TMP4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .unreq STRIDE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.endfunc
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+pixman_end_asm_function
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>---- pixman/pixman-arma64-neon-asm.h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+++ pixman/pixman-arma64-neon-asm.h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -80,146 +80,146 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixldst1 op, elem_size, reg1, mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- op {v&reg1&.&elem_size}, [&mem_operand&], #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \op {v\()\reg1\().\()\elem_size}, [\()\mem_operand\()], #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixldst2 op, elem_size, reg1, reg2, mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- op {v&reg1&.&elem_size, v&reg2&.&elem_size}, [&mem_operand&], #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \op {v\()\reg1\().\()\elem_size, v\()\reg2\().\()\elem_size}, [\()\mem_operand\()], #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixldst4 op, elem_size, reg1, reg2, reg3, reg4, mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- op {v&reg1&.&elem_size, v&reg2&.&elem_size, v&reg3&.&elem_size, v&reg4&.&elem_size}, [&mem_operand&], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \op {v\()\reg1\().\()\elem_size, v\()\reg2\().\()\elem_size, v\()\reg3\().\()\elem_size, v\()\reg4\().\()\elem_size}, [\()\mem_operand\()], #32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixldst0 op, elem_size, reg1, idx, mem_operand, abits, bytes
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- op {v&reg1&.&elem_size}[idx], [&mem_operand&], #&bytes&
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \op {v\()\reg1\().\()\elem_size}[\idx], [\()\mem_operand\()], #\()\bytes\()
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixldst3 op, elem_size, reg1, reg2, reg3, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- op {v&reg1&.&elem_size, v&reg2&.&elem_size, v&reg3&.&elem_size}, [&mem_operand&], #24
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \op {v\()\reg1\().\()\elem_size, v\()\reg2\().\()\elem_size, v\()\reg3\().\()\elem_size}, [\()\mem_operand\()], #24
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixldst30 op, elem_size, reg1, reg2, reg3, idx, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- op {v&reg1&.&elem_size, v&reg2&.&elem_size, v&reg3&.&elem_size}[idx], [&mem_operand&], #3
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \op {v\()\reg1\().\()\elem_size, v\()\reg2\().\()\elem_size, v\()\reg3\().\()\elem_size}[\idx], [\()\mem_operand\()], #3
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixldst numbytes, op, elem_size, basereg, mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if numbytes == 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .if elem_size==32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst4 op, 2s, %(basereg+4), %(basereg+5), \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- %(basereg+6), %(basereg+7), mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .elseif elem_size==16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst4 op, 4h, %(basereg+4), %(basereg+5), \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- %(basereg+6), %(basereg+7), mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \numbytes == 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .if \elem_size==32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst4 \op, 2s, %(\basereg+4), %(\basereg+5), \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ %(\basereg+6), %(\basereg+7), \mem_operand, \abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .elseif \elem_size==16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst4 \op, 4h, %(\basereg+4), %(\basereg+5), \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ %(\basereg+6), %(\basereg+7), \mem_operand, \abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst4 op, 8b, %(basereg+4), %(basereg+5), \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- %(basereg+6), %(basereg+7), mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst4 \op, 8b, %(\basereg+4), %(\basereg+5), \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ %(\basereg+6), %(\basereg+7), \mem_operand, \abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numbytes == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .if elem_size==32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst2 op, 2s, %(basereg+2), %(basereg+3), mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .elseif elem_size==16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst2 op, 4h, %(basereg+2), %(basereg+3), mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numbytes == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .if \elem_size==32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst2 \op, 2s, %(\basereg+2), %(\basereg+3), \mem_operand, \abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .elseif \elem_size==16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst2 \op, 4h, %(\basereg+2), %(\basereg+3), \mem_operand, \abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst2 op, 8b, %(basereg+2), %(basereg+3), mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst2 \op, 8b, %(\basereg+2), %(\basereg+3), \mem_operand, \abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numbytes == 8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .if elem_size==32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst1 op, 2s, %(basereg+1), mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .elseif elem_size==16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst1 op, 4h, %(basereg+1), mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numbytes == 8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .if \elem_size==32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst1 \op, 2s, %(\basereg+1), \mem_operand, \abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .elseif \elem_size==16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst1 \op, 4h, %(\basereg+1), \mem_operand, \abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst1 op, 8b, %(basereg+1), mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst1 \op, 8b, %(\basereg+1), \mem_operand, \abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numbytes == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .if !RESPECT_STRICT_ALIGNMENT || (elem_size == 32)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst0 op, s, %(basereg+0), 1, mem_operand, abits, 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .elseif elem_size == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst0 op, h, %(basereg+0), 2, mem_operand, abits, 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst0 op, h, %(basereg+0), 3, mem_operand, abits, 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numbytes == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .if !RESPECT_STRICT_ALIGNMENT || (\elem_size == 32)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst0 \op, s, %(\basereg+0), 1, \mem_operand, \abits, 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .elseif \elem_size == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst0 \op, h, %(\basereg+0), 2, \mem_operand, \abits, 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst0 \op, h, %(\basereg+0), 3, \mem_operand, \abits, 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst0 op, b, %(basereg+0), 4, mem_operand, abits, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst0 op, b, %(basereg+0), 5, mem_operand, abits, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst0 op, b, %(basereg+0), 6, mem_operand, abits, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst0 op, b, %(basereg+0), 7, mem_operand, abits, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst0 \op, b, %(\basereg+0), 4, \mem_operand, \abits, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst0 \op, b, %(\basereg+0), 5, \mem_operand, \abits, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst0 \op, b, %(\basereg+0), 6, \mem_operand, \abits, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst0 \op, b, %(\basereg+0), 7, \mem_operand, \abits, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numbytes == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .if !RESPECT_STRICT_ALIGNMENT || (elem_size == 16)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst0 op, h, %(basereg+0), 1, mem_operand, abits, 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numbytes == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .if !RESPECT_STRICT_ALIGNMENT || (\elem_size == 16)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst0 \op, h, %(\basereg+0), 1, \mem_operand, \abits, 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst0 op, b, %(basereg+0), 2, mem_operand, abits, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst0 op, b, %(basereg+0), 3, mem_operand, abits, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst0 \op, b, %(\basereg+0), 2, \mem_operand, \abits, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst0 \op, b, %(\basereg+0), 3, \mem_operand, \abits, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numbytes == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst0 op, b, %(basereg+0), 1, mem_operand, abits, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numbytes == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst0 \op, b, %(\basereg+0), 1, \mem_operand, \abits, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .error "unsupported size: numbytes"
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .error "unsupported size: \numbytes"
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixld numpix, bpp, basereg, mem_operand, abits=0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if bpp > 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if (bpp == 32) && (numpix == 8) && (DEINTERLEAVE_32BPP_ENABLED != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst4 ld4, 8b, %(basereg+4), %(basereg+5), \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- %(basereg+6), %(basereg+7), mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif (bpp == 24) && (numpix == 8)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst3 ld3, 8b, %(basereg+3), %(basereg+4), %(basereg+5), mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif (bpp == 24) && (numpix == 4)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst30 ld3, b, %(basereg+0), %(basereg+1), %(basereg+2), 4, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst30 ld3, b, %(basereg+0), %(basereg+1), %(basereg+2), 5, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst30 ld3, b, %(basereg+0), %(basereg+1), %(basereg+2), 6, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst30 ld3, b, %(basereg+0), %(basereg+1), %(basereg+2), 7, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif (bpp == 24) && (numpix == 2)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst30 ld3, b, %(basereg+0), %(basereg+1), %(basereg+2), 2, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst30 ld3, b, %(basereg+0), %(basereg+1), %(basereg+2), 3, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif (bpp == 24) && (numpix == 1)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst30 ld3, b, %(basereg+0), %(basereg+1), %(basereg+2), 1, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \bpp > 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if (\bpp == 32) && (\numpix == 8) && (DEINTERLEAVE_32BPP_ENABLED != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst4 ld4, 8b, %(\basereg+4), %(\basereg+5), \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ %(\basereg+6), %(\basereg+7), \mem_operand, \abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif (\bpp == 24) && (\numpix == 8)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst3 ld3, 8b, %(\basereg+3), %(\basereg+4), %(\basereg+5), \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif (\bpp == 24) && (\numpix == 4)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst30 ld3, b, %(\basereg+0), %(\basereg+1), %(\basereg+2), 4, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst30 ld3, b, %(\basereg+0), %(\basereg+1), %(\basereg+2), 5, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst30 ld3, b, %(\basereg+0), %(\basereg+1), %(\basereg+2), 6, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst30 ld3, b, %(\basereg+0), %(\basereg+1), %(\basereg+2), 7, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif (\bpp == 24) && (\numpix == 2)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst30 ld3, b, %(\basereg+0), %(\basereg+1), %(\basereg+2), 2, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst30 ld3, b, %(\basereg+0), %(\basereg+1), %(\basereg+2), 3, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif (\bpp == 24) && (\numpix == 1)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst30 ld3, b, %(\basereg+0), %(\basereg+1), %(\basereg+2), 1, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst %(numpix * bpp / 8), ld1, %(bpp), basereg, mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst %(\numpix * \bpp / 8), ld1, %(\bpp), \basereg, \mem_operand, \abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixst numpix, bpp, basereg, mem_operand, abits=0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if bpp > 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if (bpp == 32) && (numpix == 8) && (DEINTERLEAVE_32BPP_ENABLED != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst4 st4, 8b, %(basereg+4), %(basereg+5), \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- %(basereg+6), %(basereg+7), mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif (bpp == 24) && (numpix == 8)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst3 st3, 8b, %(basereg+3), %(basereg+4), %(basereg+5), mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif (bpp == 24) && (numpix == 4)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst30 st3, b, %(basereg+0), %(basereg+1), %(basereg+2), 4, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst30 st3, b, %(basereg+0), %(basereg+1), %(basereg+2), 5, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst30 st3, b, %(basereg+0), %(basereg+1), %(basereg+2), 6, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst30 st3, b, %(basereg+0), %(basereg+1), %(basereg+2), 7, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif (bpp == 24) && (numpix == 2)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst30 st3, b, %(basereg+0), %(basereg+1), %(basereg+2), 2, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst30 st3, b, %(basereg+0), %(basereg+1), %(basereg+2), 3, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif (bpp == 24) && (numpix == 1)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst30 st3, b, %(basereg+0), %(basereg+1), %(basereg+2), 1, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix * bpp == 32 && abits == 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst 4, st1, 32, basereg, mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numpix * bpp == 16 && abits == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst 2, st1, 16, basereg, mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \bpp > 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if (\bpp == 32) && (\numpix == 8) && (DEINTERLEAVE_32BPP_ENABLED != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst4 st4, 8b, %(\basereg+4), %(\basereg+5), \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ %(\basereg+6), %(\basereg+7), \mem_operand, \abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif (\bpp == 24) && (\numpix == 8)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst3 st3, 8b, %(\basereg+3), %(\basereg+4), %(\basereg+5), \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif (\bpp == 24) && (\numpix == 4)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst30 st3, b, %(\basereg+0), %(\basereg+1), %(\basereg+2), 4, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst30 st3, b, %(\basereg+0), %(\basereg+1), %(\basereg+2), 5, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst30 st3, b, %(\basereg+0), %(\basereg+1), %(\basereg+2), 6, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst30 st3, b, %(\basereg+0), %(\basereg+1), %(\basereg+2), 7, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif (\bpp == 24) && (\numpix == 2)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst30 st3, b, %(\basereg+0), %(\basereg+1), %(\basereg+2), 2, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst30 st3, b, %(\basereg+0), %(\basereg+1), %(\basereg+2), 3, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif (\bpp == 24) && (\numpix == 1)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst30 st3, b, %(\basereg+0), %(\basereg+1), %(\basereg+2), 1, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix * \bpp == 32 && \abits == 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst 4, st1, 32, \basereg, \mem_operand, \abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numpix * \bpp == 16 && \abits == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst 2, st1, 16, \basereg, \mem_operand, \abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixldst %(numpix * bpp / 8), st1, %(bpp), basereg, mem_operand, abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixldst %(\numpix * \bpp / 8), st1, %(\bpp), \basereg, \mem_operand, \abits
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixld_a numpix, bpp, basereg, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if (bpp * numpix) <= 128
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld numpix, bpp, basereg, mem_operand, %(bpp * numpix)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if (\bpp * \numpix) <= 128
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld \numpix, \bpp, \basereg, \mem_operand, %(\bpp * \numpix)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld numpix, bpp, basereg, mem_operand, 128
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld \numpix, \bpp, \basereg, \mem_operand, 128
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixst_a numpix, bpp, basereg, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if (bpp * numpix) <= 128
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixst numpix, bpp, basereg, mem_operand, %(bpp * numpix)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if (\bpp * \numpix) <= 128
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixst \numpix, \bpp, \basereg, \mem_operand, %(\bpp * \numpix)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixst numpix, bpp, basereg, mem_operand, 128
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixst \numpix, \bpp, \basereg, \mem_operand, 128
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -228,96 +228,96 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- * aliases to be defined)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixld1_s elem_size, reg1, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if elem_size == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \elem_size == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr TMP1, VX, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- adds VX, VX, UNIT_X
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bmi 55f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 5: subs VX, VX, SRC_WIDTH_FIXED
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bpl 5b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 55:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- add TMP1, mem_operand, TMP1, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ add TMP1, \mem_operand, TMP1, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr TMP2, VX, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- adds VX, VX, UNIT_X
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bmi 55f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 5: subs VX, VX, SRC_WIDTH_FIXED
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bpl 5b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 55:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- add TMP2, mem_operand, TMP2, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {v&reg1&.h}[0], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ add TMP2, \mem_operand, TMP2, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {v\()\reg1\().h}[0], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr TMP1, VX, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- adds VX, VX, UNIT_X
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bmi 55f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 5: subs VX, VX, SRC_WIDTH_FIXED
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bpl 5b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 55:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- add TMP1, mem_operand, TMP1, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {v&reg1&.h}[1], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ add TMP1, \mem_operand, TMP1, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {v\()\reg1\().h}[1], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr TMP2, VX, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- adds VX, VX, UNIT_X
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bmi 55f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 5: subs VX, VX, SRC_WIDTH_FIXED
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bpl 5b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 55:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- add TMP2, mem_operand, TMP2, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {v&reg1&.h}[2], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {v&reg1&.h}[3], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif elem_size == 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ add TMP2, \mem_operand, TMP2, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {v\()\reg1\().h}[2], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {v\()\reg1\().h}[3], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \elem_size == 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr TMP1, VX, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- adds VX, VX, UNIT_X
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bmi 55f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 5: subs VX, VX, SRC_WIDTH_FIXED
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bpl 5b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 55:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- add TMP1, mem_operand, TMP1, lsl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ add TMP1, \mem_operand, TMP1, lsl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr TMP2, VX, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- adds VX, VX, UNIT_X
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bmi 55f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 5: subs VX, VX, SRC_WIDTH_FIXED
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bpl 5b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 55:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- add TMP2, mem_operand, TMP2, lsl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {v&reg1&.s}[0], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {v&reg1&.s}[1], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ add TMP2, \mem_operand, TMP2, lsl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {v\()\reg1\().s}[0], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {v\()\reg1\().s}[1], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .error "unsupported"
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixld2_s elem_size, reg1, reg2, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if 0 /* elem_size == 32 */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if 0 /* \elem_size == 32 */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov TMP1, VX, asr #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add VX, VX, UNIT_X, asl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- add TMP1, mem_operand, TMP1, asl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ add TMP1, \mem_operand, TMP1, asl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov TMP2, VX, asr #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub VX, VX, UNIT_X
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- add TMP2, mem_operand, TMP2, asl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {v&reg1&.s}[0], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ add TMP2, \mem_operand, TMP2, asl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {v\()\reg1\().s}[0], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov TMP1, VX, asr #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add VX, VX, UNIT_X, asl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- add TMP1, mem_operand, TMP1, asl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {v&reg2&.s}[0], [TMP2, :32]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ add TMP1, \mem_operand, TMP1, asl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {v\()\reg2\().s}[0], [TMP2, :32]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov TMP2, VX, asr #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- add VX, VX, UNIT_X
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- add TMP2, mem_operand, TMP2, asl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {v&amp;reg1&amp;.s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {v&amp;reg2&amp;.s}[1], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ add TMP2, \mem_operand, TMP2, asl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {v\()\reg1\().s}[1], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {v\()\reg2\().s}[1], [TMP2]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld1_s elem_size, reg1, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld1_s elem_size, reg2, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld1_s \elem_size, \reg1, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld1_s \elem_size, \reg2, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixld0_s elem_size, reg1, idx, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if elem_size == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \elem_size == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr TMP1, VX, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- adds VX, VX, UNIT_X
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bmi 55f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 5: subs VX, VX, SRC_WIDTH_FIXED
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bpl 5b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 55:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- add TMP1, mem_operand, TMP1, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {v&amp;reg1&amp;.h}[idx], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif elem_size == 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ add TMP1, \mem_operand, TMP1, lsl #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {v\()\reg1\().h}[\idx], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \elem_size == 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- asr DUMMY, VX, #16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov TMP1, DUMMY
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- adds VX, VX, UNIT_X
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -325,85 +325,85 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 5: subs VX, VX, SRC_WIDTH_FIXED
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bpl 5b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 55:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- add TMP1, mem_operand, TMP1, lsl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ld1 {v&amp;reg1&amp;.s}[idx], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ add TMP1, \mem_operand, TMP1, lsl #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ld1 {v\()\reg1\().s}[\idx], [TMP1]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixld_s_internal numbytes, elem_size, basereg, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if numbytes == 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld2_s elem_size, %(basereg+4), %(basereg+5), mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld2_s elem_size, %(basereg+6), %(basereg+7), mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixdeinterleave elem_size, %(basereg+4)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numbytes == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld2_s elem_size, %(basereg+2), %(basereg+3), mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numbytes == 8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld1_s elem_size, %(basereg+1), mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numbytes == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .if elem_size == 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld0_s elem_size, %(basereg+0), 1, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .elseif elem_size == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld0_s elem_size, %(basereg+0), 2, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld0_s elem_size, %(basereg+0), 3, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \numbytes == 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld2_s \elem_size, %(\basereg+4), %(\basereg+5), \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld2_s \elem_size, %(\basereg+6), %(\basereg+7), \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixdeinterleave \elem_size, %(\basereg+4)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numbytes == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld2_s \elem_size, %(\basereg+2), %(\basereg+3), \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numbytes == 8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld1_s \elem_size, %(\basereg+1), \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numbytes == 4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .if \elem_size == 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld0_s \elem_size, %(\basereg+0), 1, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .elseif \elem_size == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld0_s \elem_size, %(\basereg+0), 2, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld0_s \elem_size, %(\basereg+0), 3, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld0_s elem_size, %(basereg+0), 4, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld0_s elem_size, %(basereg+0), 5, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld0_s elem_size, %(basereg+0), 6, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld0_s elem_size, %(basereg+0), 7, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld0_s \elem_size, %(\basereg+0), 4, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld0_s \elem_size, %(\basereg+0), 5, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld0_s \elem_size, %(\basereg+0), 6, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld0_s \elem_size, %(\basereg+0), 7, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numbytes == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .if elem_size == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld0_s elem_size, %(basereg+0), 1, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numbytes == 2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .if \elem_size == 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld0_s \elem_size, %(\basereg+0), 1, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld0_s elem_size, %(basereg+0), 2, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld0_s elem_size, %(basereg+0), 3, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld0_s \elem_size, %(\basereg+0), 2, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld0_s \elem_size, %(\basereg+0), 3, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.elseif numbytes == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld0_s elem_size, %(basereg+0), 1, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.elseif \numbytes == 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld0_s \elem_size, %(\basereg+0), 1, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .error "unsupported size: numbytes"
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .error "unsupported size: \numbytes"
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixld_s numpix, bpp, basereg, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if bpp > 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld_s_internal %(numpix * bpp / 8), %(bpp), basereg, mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \bpp > 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld_s_internal %(\numpix * \bpp / 8), %(\bpp), \basereg, \mem_operand
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro vuzp8 reg1, reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umov DUMMY, v16.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- uzp1 v16.8b, v&amp;reg1&amp;.8b, v&amp;reg2&amp;.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- uzp2 v&amp;reg2&amp;.8b, v&amp;reg1&amp;.8b, v&amp;reg2&amp;.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov v&amp;reg1&amp;.8b, v16.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ uzp1 v16.8b, v\()\reg1\().8b, v\()\reg2\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ uzp2 v\()\reg2\().8b, v\()\reg1\().8b, v\()\reg2\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov v\()\reg1\().8b, v16.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov v16.d[0], DUMMY
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro vzip8 reg1, reg2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- umov DUMMY, v16.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- zip1 v16.8b, v&amp;reg1&amp;.8b, v&amp;reg2&amp;.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- zip2 v&amp;reg2&amp;.8b, v&amp;reg1&amp;.8b, v&amp;reg2&amp;.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov v&amp;reg1&amp;.8b, v16.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ zip1 v16.8b, v\()\reg1\().8b, v\()\reg2\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ zip2 v\()\reg2\().8b, v\()\reg1\().8b, v\()\reg2\().8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov v\()\reg1\().8b, v16.8b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov v16.d[0], DUMMY
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* deinterleave B, G, R, A channels for eight 32bpp pixels in 4 registers */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixdeinterleave bpp, basereg
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if (bpp == 32) && (DEINTERLEAVE_32BPP_ENABLED != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vuzp8 %(basereg+0), %(basereg+1)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vuzp8 %(basereg+2), %(basereg+3)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vuzp8 %(basereg+1), %(basereg+3)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vuzp8 %(basereg+0), %(basereg+2)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if (\bpp == 32) && (DEINTERLEAVE_32BPP_ENABLED != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vuzp8 %(\basereg+0), %(\basereg+1)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vuzp8 %(\basereg+2), %(\basereg+3)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vuzp8 %(\basereg+1), %(\basereg+3)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vuzp8 %(\basereg+0), %(\basereg+2)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* interleave B, G, R, A channels for eight 32bpp pixels in 4 registers */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixinterleave bpp, basereg
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if (bpp == 32) && (DEINTERLEAVE_32BPP_ENABLED != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip8 %(basereg+0), %(basereg+2)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip8 %(basereg+1), %(basereg+3)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip8 %(basereg+2), %(basereg+3)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- vzip8 %(basereg+0), %(basereg+1)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if (\bpp == 32) && (DEINTERLEAVE_32BPP_ENABLED != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip8 %(\basereg+0), %(\basereg+2)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip8 %(\basereg+1), %(\basereg+3)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip8 %(\basereg+2), %(\basereg+3)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ vzip8 %(\basereg+0), %(\basereg+1)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -437,52 +437,52 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro PF a, x:vararg
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if (PREFETCH_TYPE_CURRENT == PREFETCH_TYPE_ADVANCED)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- a x
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \a \x
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro cache_preload std_increment, boost_increment
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if (src_bpp_shift >= 0) || (dst_r_bpp != 0) || (mask_bpp_shift >= 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if std_increment != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #std_increment
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \std_increment != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #\std_increment
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF tst PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF beq 71f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #boost_increment
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF tst, PF_CTL, #0xF
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF beq, 71f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #\boost_increment
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_CTL, PF_CTL, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 71:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF cmp PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF cmp, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if src_bpp_shift >= 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if dst_r_bpp != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if mask_bpp_shift >= 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, PF_X, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, PF_X, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 71f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF sub PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF subs PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 71f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF sub, PF_X, PF_X, ORIG_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF subs, PF_CTL, PF_CTL, #0x10
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 71:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ble 72f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ble, 72f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if src_bpp_shift >= 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_SRC, PF_SRC, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if dst_r_bpp != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_DST, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_DST, PF_DST, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if mask_bpp_shift >= 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, MASK_STRIDE, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF ldrsb DUMMY, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_MASK, PF_MASK, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, MASK_STRIDE, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF ldrsb, DUMMY, [PF_MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_MASK, PF_MASK, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 72:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -521,21 +521,21 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if src_bpp > 0 || mask_bpp > 0 || dst_r_bpp > 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .irp lowbit, 1, 2, 4, 8, 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if (dst_w_bpp <= (lowbit * 8)) && ((lowbit * 8) < (pixblock_size * dst_w_bpp))
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if lowbit < 16 /* we don't need more than 16-byte alignment */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tst DST_R, #lowbit
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if (dst_w_bpp <= (\lowbit * 8)) && ((\lowbit * 8) < (pixblock_size * dst_w_bpp))
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \lowbit < 16 /* we don't need more than 16-byte alignment */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ tst DST_R, #\lowbit
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 51f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld_src (lowbit * 8 / dst_w_bpp), src_bpp, src_basereg, SRC
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld (lowbit * 8 / dst_w_bpp), mask_bpp, mask_basereg, MASK
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld_src (\lowbit * 8 / dst_w_bpp), src_bpp, src_basereg, SRC
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld (\lowbit * 8 / dst_w_bpp), mask_bpp, mask_basereg, MASK
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if dst_r_bpp > 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld_a (lowbit * 8 / dst_r_bpp), dst_r_bpp, dst_r_basereg, DST_R
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld_a (\lowbit * 8 / dst_r_bpp), dst_r_bpp, dst_r_basereg, DST_R
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- add DST_R, DST_R, #lowbit
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ add DST_R, DST_R, #\lowbit
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #(lowbit * 8 / dst_w_bpp)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- sub W, W, #(lowbit * 8 / dst_w_bpp)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #(\lowbit * 8 / dst_w_bpp)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ sub W, W, #(\lowbit * 8 / dst_w_bpp)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 51:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endr
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -544,23 +544,23 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixdeinterleave mask_bpp, mask_basereg
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixdeinterleave dst_r_bpp, dst_r_basereg
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- cache_preload 0, pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- cache_preload_simple
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixinterleave dst_w_bpp, dst_w_basereg
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .irp lowbit, 1, 2, 4, 8, 16
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if (dst_w_bpp <= (lowbit * 8)) && ((lowbit * 8) < (pixblock_size * dst_w_bpp))
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if lowbit < 16 /* we don't need more than 16-byte alignment */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tst DST_W, #lowbit
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if (dst_w_bpp <= (\lowbit * 8)) && ((\lowbit * 8) < (pixblock_size * dst_w_bpp))
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \lowbit < 16 /* we don't need more than 16-byte alignment */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ tst DST_W, #\lowbit
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 51f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if src_bpp == 0 && mask_bpp == 0 && dst_r_bpp == 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- sub W, W, #(lowbit * 8 / dst_w_bpp)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ sub W, W, #(\lowbit * 8 / dst_w_bpp)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixst_a (lowbit * 8 / dst_w_bpp), dst_w_bpp, dst_w_basereg, DST_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixst_a (\lowbit * 8 / dst_w_bpp), dst_w_bpp, dst_w_basereg, DST_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 51:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endr
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -592,18 +592,18 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 52f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if src_bpp > 0 || mask_bpp > 0 || dst_r_bpp > 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .irp chunk_size, 16, 8, 4, 2, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if pixblock_size > chunk_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tst W, #chunk_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if pixblock_size > \chunk_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ tst W, #\chunk_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 51f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld_src chunk_size, src_bpp, src_basereg, SRC
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld chunk_size, mask_bpp, mask_basereg, MASK
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if dst_aligned_flag != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld_a chunk_size, dst_r_bpp, dst_r_basereg, DST_R
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld_src \chunk_size, src_bpp, src_basereg, SRC
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld \chunk_size, mask_bpp, mask_basereg, MASK
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \dst_aligned_flag != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld_a \chunk_size, dst_r_bpp, dst_r_basereg, DST_R
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld chunk_size, dst_r_bpp, dst_r_basereg, DST_R
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld \chunk_size, dst_r_bpp, dst_r_basereg, DST_R
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if cache_preload_flag != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #chunk_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \cache_preload_flag != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #\chunk_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 51:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -613,21 +613,21 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixdeinterleave mask_bpp, mask_basereg
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixdeinterleave dst_r_bpp, dst_r_basereg
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if cache_preload_flag != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \cache_preload_flag != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- cache_preload 0, pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- cache_preload_simple
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixinterleave dst_w_bpp, dst_w_basereg
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .irp chunk_size, 16, 8, 4, 2, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if pixblock_size > chunk_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tst W, #chunk_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if pixblock_size > \chunk_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ tst W, #\chunk_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- beq 51f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if dst_aligned_flag != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixst_a chunk_size, dst_w_bpp, dst_w_basereg, DST_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \dst_aligned_flag != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixst_a \chunk_size, dst_w_bpp, dst_w_basereg, DST_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixst chunk_size, dst_w_bpp, dst_w_basereg, DST_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixst \chunk_size, dst_w_bpp, dst_w_basereg, DST_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 51:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -660,7 +660,7 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- subs H, H, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov DST_R, DST_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- bge start_of_loop_label
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ bge \start_of_loop_label
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /*
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -687,7 +687,7 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- src_basereg_ = 0, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mask_basereg_ = 24
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixman_asm_function fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixman_asm_function \fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- stp x29, x30, [sp, -16]!
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov x29, sp
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub sp, sp, 232 /* push all registers */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -712,10 +712,10 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- * has to be used instead of ADVANCED.
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .set PREFETCH_TYPE_CURRENT, PREFETCH_TYPE_DEFAULT
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if prefetch_distance == 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \prefetch_distance == 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .set PREFETCH_TYPE_CURRENT, PREFETCH_TYPE_NONE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .elseif (PREFETCH_TYPE_CURRENT > PREFETCH_TYPE_SIMPLE) && \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ((src_bpp_ == 24) || (mask_bpp_ == 24) || (dst_w_bpp_ == 24))
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ((\src_bpp_ == 24) || (\mask_bpp_ == 24) || (\dst_w_bpp_ == 24))
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .set PREFETCH_TYPE_CURRENT, PREFETCH_TYPE_SIMPLE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -723,17 +723,17 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- * Make some macro arguments globally visible and accessible
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- * from other macros
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set src_bpp, src_bpp_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set mask_bpp, mask_bpp_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set dst_w_bpp, dst_w_bpp_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set pixblock_size, pixblock_size_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set dst_w_basereg, dst_w_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set dst_r_basereg, dst_r_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set src_basereg, src_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set mask_basereg, mask_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set src_bpp, \src_bpp_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set mask_bpp, \mask_bpp_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set dst_w_bpp, \dst_w_bpp_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set pixblock_size, \pixblock_size_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set dst_w_basereg, \dst_w_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set dst_r_basereg, \dst_r_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set src_basereg, \src_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set mask_basereg, \mask_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixld_src x:vararg
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld x
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld \x
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro fetch_src_pixblock
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixld_src pixblock_size, src_bpp, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -810,22 +810,22 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .error "requested dst bpp (dst_w_bpp) is not supported"
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if (((flags) & FLAG_DST_READWRITE) != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if (((\flags) & FLAG_DST_READWRITE) != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .set dst_r_bpp, dst_w_bpp
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .set dst_r_bpp, 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if (((flags) & FLAG_DEINTERLEAVE_32BPP) != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if (((\flags) & FLAG_DEINTERLEAVE_32BPP) != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .set DEINTERLEAVE_32BPP_ENABLED, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .set DEINTERLEAVE_32BPP_ENABLED, 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if prefetch_distance < 0 || prefetch_distance > 15
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .error "invalid prefetch distance (prefetch_distance)"
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \prefetch_distance < 0 || \prefetch_distance > 15
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .error "invalid prefetch distance (\prefetch_distance)"
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF mov PF_X, #0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF mov, PF_X, #0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov DST_R, DST_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if src_bpp == 24
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -844,15 +844,15 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /*
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- * Setup advanced prefetcher initial state
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF mov PF_SRC, SRC
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF mov PF_DST, DST_R
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF mov PF_MASK, MASK
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- /* PF_CTL = prefetch_distance | ((h - 1) << 4) */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, H, #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF mov PF_CTL, DUMMY
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_CTL, PF_CTL, #(prefetch_distance - 0x10)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- init
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF mov, PF_SRC, SRC
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF mov, PF_DST, DST_R
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF mov, PF_MASK, MASK
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ /* PF_CTL = \prefetch_distance | ((h - 1) << 4) */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, H, #4
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF mov, PF_CTL, DUMMY
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_CTL, PF_CTL, #(\prefetch_distance - 0x10)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \init
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- subs H, H, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov ORIG_W, W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 9f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -863,9 +863,9 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- * long scanlines
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 0:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ensure_destination_ptr_alignment process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ensure_destination_ptr_alignment \process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* Implement "head (tail_head) ... (tail_head) tail" loop pattern */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixld_a pixblock_size, dst_r_bpp, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -873,32 +873,32 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- fetch_src_pixblock
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixld pixblock_size, mask_bpp, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- (mask_basereg - pixblock_size * mask_bpp / 64), MASK
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF add PF_X, PF_X, #pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF add, PF_X, PF_X, #pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- cache_preload 0, pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- cache_preload_simple
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- subs W, W, #(pixblock_size * 2)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 200f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 100:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- cache_preload_simple
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- subs W, W, #pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bge 100b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 200:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixst_a pixblock_size, dst_w_bpp, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- (dst_w_basereg - pixblock_size * dst_w_bpp / 64), DST_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* Process the remaining trailing pixels in the scanline */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- process_trailing_pixels 1, 1, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- advance_to_next_scanline 0b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- cleanup
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \cleanup
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 1000:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* pop all registers */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub x29, x29, 64
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -925,16 +925,16 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 800:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if src_bpp_shift >= 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, SRC_STRIDE, #src_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [SRC, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if dst_r_bpp != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [DST_R, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, DST_STRIDE, #dst_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [DST_R, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .if mask_bpp_shift >= 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF lsl DUMMY, MASK_STRIDE, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- PF prfm PREFETCH_MODE, [MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF lsl, DUMMY, MASK_STRIDE, #mask_bpp_shift
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ PF prfm, PREFETCH_MODE, [MASK, DUMMY]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* Process exactly pixblock_size pixels if needed */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- tst W, #pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -944,19 +944,19 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- fetch_src_pixblock
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixld pixblock_size, mask_bpp, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- (mask_basereg - pixblock_size * mask_bpp / 64), MASK
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixst pixblock_size, dst_w_bpp, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- (dst_w_basereg - pixblock_size * dst_w_bpp / 64), DST_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 100:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* Process the remaining trailing pixels in the scanline */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- process_trailing_pixels 0, 0, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- advance_to_next_scanline 800b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 9:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- cleanup
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \cleanup
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* pop all registers */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub x29, x29, 64
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ld1 {v8.8b, v9.8b, v10.8b, v11.8b}, [x29], 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -995,7 +995,7 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .unreq PF_DST
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .unreq PF_MASK
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .unreq DUMMY
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .endfunc
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixman_end_asm_function
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /*
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -1019,23 +1019,23 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- src_basereg_ = 0, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mask_basereg_ = 24
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixman_asm_function fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixman_asm_function \fname
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .set PREFETCH_TYPE_CURRENT, PREFETCH_TYPE_NONE
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /*
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- * Make some macro arguments globally visible and accessible
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- * from other macros
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set src_bpp, src_bpp_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set mask_bpp, mask_bpp_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set dst_w_bpp, dst_w_bpp_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set pixblock_size, pixblock_size_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set dst_w_basereg, dst_w_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set dst_r_basereg, dst_r_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set src_basereg, src_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .set mask_basereg, mask_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if use_nearest_scaling != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set src_bpp, \src_bpp_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set mask_bpp, \mask_bpp_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set dst_w_bpp, \dst_w_bpp_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set pixblock_size, \pixblock_size_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set dst_w_basereg, \dst_w_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set dst_r_basereg, \dst_r_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set src_basereg, \src_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ .set mask_basereg, \mask_basereg_
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \use_nearest_scaling != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /*
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- * Assign symbolic names to registers for nearest scaling
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -1052,7 +1052,7 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- DUMMY .req x30
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixld_src x:vararg
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld_s x
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld_s \x
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sxtw x0, w0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -1080,7 +1080,7 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- DUMMY .req x30
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro pixld_src x:vararg
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- pixld x
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixld \x
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sxtw x0, w0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -1093,12 +1093,12 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- st1 {v12.8b, v13.8b, v14.8b, v15.8b}, [x29], 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if (((flags) & FLAG_DST_READWRITE) != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if (((\flags) & FLAG_DST_READWRITE) != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .set dst_r_bpp, dst_w_bpp
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .set dst_r_bpp, 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if (((flags) & FLAG_DEINTERLEAVE_32BPP) != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if (((\flags) & FLAG_DEINTERLEAVE_32BPP) != 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .set DEINTERLEAVE_32BPP_ENABLED, 1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .else
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .set DEINTERLEAVE_32BPP_ENABLED, 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -1109,15 +1109,15 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- (src_basereg - pixblock_size * src_bpp / 64), SRC
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- init
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \init
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- mov DST_R, DST_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- cmp W, #pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 800f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ensure_destination_ptr_alignment process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ensure_destination_ptr_alignment \process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- subs W, W, #pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 700f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -1128,26 +1128,26 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- fetch_src_pixblock
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixld pixblock_size, mask_bpp, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- (mask_basereg - pixblock_size * mask_bpp / 64), MASK
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- subs W, W, #pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- blt 200f
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 100:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- subs W, W, #pixblock_size
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- bge 100b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 200:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- pixst_a pixblock_size, dst_w_bpp, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- (dst_w_basereg - pixblock_size * dst_w_bpp / 64), DST_W
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 700:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* Process the remaining trailing pixels in the scanline (dst aligned) */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- process_trailing_pixels 0, 1, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- cleanup
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if use_nearest_scaling != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \cleanup
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \use_nearest_scaling != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub x29, x29, 64
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ld1 {v8.8b, v9.8b, v10.8b, v11.8b}, [x29], 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ld1 {v12.8b, v13.8b, v14.8b, v15.8b}, [x29], 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -1167,12 +1167,12 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- 800:
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* Process the remaining trailing pixels in the scanline (dst unaligned) */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- process_trailing_pixels 0, 0, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_head, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail, \
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \process_pixblock_tail_head
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- cleanup
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--.if use_nearest_scaling != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ \cleanup
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+.if \use_nearest_scaling != 0
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- sub x29, x29, 64
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ld1 {v8.8b, v9.8b, v10.8b, v11.8b}, [x29], 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- ld1 {v12.8b, v13.8b, v14.8b, v15.8b}, [x29], 32
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -1213,15 +1213,15 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .purgem fetch_src_pixblock
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .purgem pixld_src
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- .endfunc
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ pixman_end_asm_function
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro generate_composite_function_single_scanline x:vararg
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- generate_composite_function_scanline 0, x
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ generate_composite_function_scanline 0, \x
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro generate_composite_function_nearest_scanline x:vararg
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- generate_composite_function_scanline 1, x
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ generate_composite_function_scanline 1, \x
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /* Default prologue/epilogue, nothing special needs to be done */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -1255,22 +1255,22 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- * value (in) is lost.
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro convert_0565_to_8888 in, out_a, out_r, out_g, out_b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- shrn &out_r&.8b, &in&.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- shrn &out_g&.8b, &in&.8h, #3
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- sli &in&.8h, &in&.8h, #5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- movi &out_a&.8b, #255
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- sri &out_r&.8b, &out_r&.8b, #5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- sri &out_g&.8b, &out_g&.8b, #6
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- shrn &out_b&.8b, &in&.8h, #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ shrn \()\out_r\().8b, \()\in\().8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ shrn \()\out_g\().8b, \()\in\().8h, #3
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ sli \()\in\().8h, \()\in\().8h, #5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ movi \()\out_a\().8b, #255
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ sri \()\out_r\().8b, \()\out_r\().8b, #5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ sri \()\out_g\().8b, \()\out_g\().8b, #6
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ shrn \()\out_b\().8b, \()\in\().8h, #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro convert_0565_to_x888 in, out_r, out_g, out_b
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- shrn &out_r&.8b, &in&.8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- shrn &out_g&.8b, &in&.8h, #3
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- sli &in&.8h, &in&.8h, #5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- sri &out_r&.8b, &out_r&.8b, #5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- sri &out_g&.8b, &out_g&.8b, #6
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- shrn &out_b&.8b, &in&.8h, #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ shrn \()\out_r\().8b, \()\in\().8h, #8
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ shrn \()\out_g\().8b, \()\in\().8h, #3
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ sli \()\in\().8h, \()\in\().8h, #5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ sri \()\out_r\().8b, \()\out_r\().8b, #5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ sri \()\out_g\().8b, \()\out_g\().8b, #6
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ shrn \()\out_b\().8b, \()\in\().8h, #2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /*
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -1280,14 +1280,14 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- * registers (tmp1, tmp2)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro convert_8888_to_0565 in_r, in_g, in_b, out, tmp1, tmp2
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ushll &tmp1&.8h, &in_g&.8b, #7
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- shl &tmp1&.8h, &tmp1&.8h, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ushll &out&.8h, &in_r&.8b, #7
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- shl &out&.8h, &out&.8h, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ushll &tmp2&.8h, &in_b&.8b, #7
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- shl &tmp2&.8h, &tmp2&.8h, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- sri &out&.8h, &tmp1&.8h, #5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- sri &out&.8h, &tmp2&.8h, #11
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ushll \()\tmp1\().8h, \()\in_g\().8b, #7
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ shl \()\tmp1\().8h, \()\tmp1\().8h, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ushll \()\out\().8h, \()\in_r\().8b, #7
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ shl \()\out\().8h, \()\out\().8h, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ushll \()\tmp2\().8h, \()\in_b\().8b, #7
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ shl \()\tmp2\().8h, \()\tmp2\().8h, #1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ sri \()\out\().8h, \()\tmp1\().8h, #5
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ sri \()\out\().8h, \()\tmp2\().8h, #11
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- /*
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -1297,14 +1297,14 @@ local skip1
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- * value from 'in' is lost
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .macro convert_four_0565_to_x888_packed in, out0, out1, tmp
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- shl &out0&.4h, &in&.4h, #5 /* G top 6 bits */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- shl &tmp&.4h, &in&.4h, #11 /* B top 5 bits */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- sri &in&.4h, &in&.4h, #5 /* R is ready in top bits */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- sri &out0&.4h, &out0&.4h, #6 /* G is ready in top bits */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- sri &tmp&.4h, &tmp&.4h, #5 /* B is ready in top bits */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- ushr &out1&.4h, &in&.4h, #8 /* R is in place */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- sri &out0&.4h, &tmp&.4h, #8 /* G & B is in place */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- zip1 &tmp&.4h, &out0&.4h, &out1&.4h /* everything is in place */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- zip2 &out1&.4h, &out0&.4h, &out1&.4h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- mov &out0&.d[0], &tmp&.d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ shl \()\out0\().4h, \()\in\().4h, #5 /* G top 6 bits */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ shl \()\tmp\().4h, \()\in\().4h, #11 /* B top 5 bits */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ sri \()\in\().4h, \()\in\().4h, #5 /* R is ready \in top bits */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ sri \()\out0\().4h, \()\out0\().4h, #6 /* G is ready \in top bits */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ sri \()\tmp\().4h, \()\tmp\().4h, #5 /* B is ready \in top bits */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ ushr \()\out1\().4h, \()\in\().4h, #8 /* R is \in place */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ sri \()\out0\().4h, \()\tmp\().4h, #8 /* G \() B is \in place */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ zip1 \()\tmp\().4h, \()\out0\().4h, \()\out1\().4h /* everything is \in place */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ zip2 \()\out1\().4h, \()\out0\().4h, \()\out1\().4h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ mov \()\out0\().d[0], \()\tmp\().d[0]
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- .endm
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>---- test/utils.h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+++ test/utils.h
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -3,7 +3,7 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- #endif
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- #include <assert.h>
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>--#include "pixman-private.h" /* For 'inline' definition */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+#include "pixman-compiler.h" /* For 'inline' definition */
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- #include "utils-prng.h"
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- #if defined(_MSC_VER)
</span><span style='display:block; white-space:pre;color:#808080;'>diff --git a/graphics/libpixman/files/patch-pixman-pixman-vmx.c.diff b/graphics/libpixman/files/patch-pixman-pixman-vmx.c.diff
</span>deleted file mode 100644
<span style='display:block; white-space:pre;color:#808080;'>index 452005a52bd..00000000000
</span><span style='display:block; white-space:pre;background:#e0e0ff;'>--- a/graphics/libpixman/files/patch-pixman-pixman-vmx.c.diff
</span><span style='display:block; white-space:pre;background:#e0e0ff;'>+++ /dev/null
</span><span style='display:block; white-space:pre;background:#e0e0e0;'>@@ -1,14 +0,0 @@
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-https://bugs.freedesktop.org/show_bug.cgi?id=94769
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>---- pixman/pixman-vmx.c.orig
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+++ pixman/pixman-vmx.c
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-@@ -2933,10 +2933,7 @@ scaled_nearest_scanline_vmx_8888_8888_OVER (uint32_t* pd,
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- while (vx >= 0)
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- vx -= src_width_fixed;
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tmp[0] = tmp1;
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tmp[1] = tmp2;
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tmp[2] = tmp3;
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-- tmp[3] = tmp4;
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-+ tmp = (vector unsigned int){tmp1, tmp2, tmp3, tmp4};
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>-
</span><span style='display:block; white-space:pre;background:#ffe0e0;'>- vsrc = combine4 ((const uint32_t *) &tmp, pm);
</span></pre><pre style='margin:0'>
</pre>