The resample asm code currently handles one sample at a time. It should be redesigned to handle more than one sample at a time.
Previously scale_cascaded() assumed the whole source frame arrived in a single sws_scale() call, and the dispatcher only routed full-frame calls to it.
Makes it a bit easier to add ops and uops in separate commits.
ff_yuv2rgb_get_func_ptr() now returns the C reference for explicit BE/LE 16bpp formats, not only the NE alias.
BE counterparts to the LE paths in 2e142e52ae; pack adds rev16 before store.
WS_EX_LAYERED allows input events to pass through to windows beneath.
Also add an a64op_lr() helper for the LR register.
Required to correctly present raw video.
This decomposes a swizzle mask into a series of optimal register-register moves, using at most two temporary scratch registers.
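For illustration only, the general idea of turning a swizzle mask into sequential register moves can be sketched as below. This is a textbook parallel-move resolution, not the code from this change: it uses a single scratch register to break cycles and makes no claim of producing an optimal sequence. The names NB_COMP, SCRATCH, MAX_MOVES, Move, swizzle_to_moves() and moves_match_mask() are hypothetical, and the convention that register i ends up holding the old value of register mask[i] is an assumption.

```c
#define NB_COMP   4                       /* assumed number of swizzled components */
#define SCRATCH   NB_COMP                 /* hypothetical index of the scratch register */
#define MAX_MOVES (NB_COMP + NB_COMP / 2) /* worst case: only 2-cycles */

typedef struct Move {
    int dst, src;                         /* reg[dst] = reg[src] */
} Move;

/* Turn the parallel assignment "reg[i] = old reg[mask[i]]" into a sequence
 * of moves. Returns the number of moves written to out[]. */
static int swizzle_to_moves(const int mask[NB_COMP], Move out[MAX_MOVES])
{
    int src[NB_COMP], pending[NB_COMP];
    int nb_out = 0, left = 0;

    for (int i = 0; i < NB_COMP; i++) {
        src[i]     = mask[i];
        pending[i] = mask[i] != i;        /* identity lanes need no move */
        left      += pending[i];
    }

    while (left > 0) {
        int progress = 0;

        /* Emit every move whose destination no other pending move reads. */
        for (int i = 0; i < NB_COMP; i++) {
            int read = 0;
            if (!pending[i])
                continue;
            for (int j = 0; j < NB_COMP; j++)
                read |= pending[j] && j != i && src[j] == i;
            if (read)
                continue;
            out[nb_out++] = (Move){ i, src[i] };
            pending[i]    = 0;
            progress      = 1;
            left--;
        }
        if (progress)
            continue;

        /* Only cycles remain: save one destination in the scratch register
         * and redirect its readers, which unblocks that cycle. */
        for (int i = 0; i < NB_COMP; i++) {
            if (!pending[i])
                continue;
            out[nb_out++] = (Move){ SCRATCH, i };
            for (int j = 0; j < NB_COMP; j++)
                if (pending[j] && src[j] == i)
                    src[j] = SCRATCH;
            break;
        }
    }
    return nb_out;
}

/* Simulate the moves on a dummy register file and check the result. */
static int moves_match_mask(const int mask[NB_COMP], const Move *mv, int nb)
{
    int reg[NB_COMP + 1];

    for (int i = 0; i <= NB_COMP; i++)
        reg[i] = 100 + i;
    for (int i = 0; i < nb; i++)
        reg[mv[i].dst] = reg[mv[i].src];
    for (int i = 0; i < NB_COMP; i++)
        if (reg[i] != 100 + mask[i])
            return 0;
    return 1;
}
```

In this sketch a broadcast mask such as {0, 0, 0, 0} needs no scratch register at all, while a mask made of two swaps such as {1, 0, 3, 2} reuses the same scratch register once per cycle.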