Given how AMD are currently focusing on the mid tier, it got me thinking about their multi-chiplet approach for RDNA5+. They will have to do a lot of work on high-speed interconnects and some form of internal scheduler/balancer for the chiplets to split out the work, etc.
So with this in mind, if they could leverage that work on interconnects and schedulers at a higher level to make a more cohesive form of Crossfire/SLI, they wouldn't even need to release any high-end cards: they could just sell you multiple mid-tier cards and you daisy chain them together (within reason). It would let them sell multiple cards to individuals, increasing sales numbers, and also let them focus on fewer models, so simpler/cheaper production.
Historically I think the issue with Crossfire/SLI was that, to make best use of it, developers had to do a lot of legwork to spread out loads, etc. But if they could somehow handle this at the lower levels like they do with the chiplets, then maybe it could be abstracted away from the developers somewhat, i.e. you designate master/slave GPUs so the OS just treats the main one as a bigger one or something.
I doubt this is on the cards, but it felt like something that was plausible and worth discussing.
I'd like to see OCuLink as the "GPU combiner". It allows for multiple external or internal GPUs, and easy expansion with power+GPU upgrades, without having to start with a PSU that supports 1500 W for your mobile APU just because it supports 3 extra GPU expansions. OCuLink doesn't rule out having many internal cards either.
Daisy-chained GPUs in games could have the final GPU do image upscaling, anti-aliasing, ray tracing and frame gen.
There is a pretty easy programming model for GPU compute on a "serial connection". If the result of a compute operation goes to render/monitor, do it on the last (more powerful) GPU. If the result will be processed further, do it on the first GPU, or on whichever GPU has cores available.
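A rough sketch of that placement rule in Python. The Gpu and Task types, the free_cores field and the place function are all made up for illustration; no real driver or API exposes anything like this, and I'm assuming the last GPU in the chain is the one wired to the monitor.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    name: str
    free_cores: int  # hypothetical: how many cores are idle right now


@dataclass
class Task:
    name: str
    feeds_display: bool  # True if the result goes straight to render/monitor


def place(task: Task, chain: list[Gpu]) -> Gpu:
    """Pick a GPU for a task in a serially connected chain (illustrative policy)."""
    if task.feeds_display:
        # Display-bound work runs on the last GPU in the chain.
        return chain[-1]
    # Intermediate work prefers the upstream GPUs; with a single GPU
    # there is no upstream, so fall back to the whole chain.
    upstream = chain[:-1] or chain
    # Most free cores wins; on a tie max() keeps the earliest GPU in the chain.
    return max(upstream, key=lambda gpu: gpu.free_cores)


chain = [Gpu("apu", 8), Gpu("mid", 32), Gpu("big", 64)]
print(place(Task("upscale", feeds_display=True), chain).name)
print(place(Task("physics", feeds_display=False), chain).name)
```

Here the upscale task lands on the last GPU and the physics task on whichever upstream GPU is least busy, so nothing display-bound has to cross the link twice.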
OCuLink (and USB4) connections would let a current computer with OCuLink v1 connect to a future GPU/NPU with 2-5 OCuLink v2 or v3 or USB5 outputs, and external storage as well. USB4 can be better than OCuLink because it can operate asymmetrically: afaik, PCIe 4.0 x6 one way, do the work, then send as PCIe 4.0 x6 the other way. Thunderbolt/USB5 will be over double OCuLink and, in symmetric mode, equivalent to PCIe 4.0 x9. USB4.2 is more than OCuLink, = PCIe 4.0 x5.
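For anyone who wants to redo the lane maths, here is a small Python sketch that converts a nominal link rate into "PCIe 4.0 lanes". The inputs are assumptions on my part: OCuLink taken as PCIe 4.0 x4, and USB4 Version 2.0 taken at its nominal 80 Gb/s symmetric and 120 Gb/s in the fast direction of asymmetric mode. Tunnelled PCIe throughput in practice is lower than the nominal link rate, so treat the results as upper bounds.

```python
# PCIe 4.0 runs at 16 GT/s per lane with 128b/130b encoding,
# so roughly 15.75 Gb/s of payload per lane, per direction.
PCIE4_GBPS_PER_LANE = 16 * 128 / 130


def pcie4_lane_equivalent(link_gbps: float) -> float:
    """Nominal one-direction link rate expressed as a number of PCIe 4.0 lanes."""
    return link_gbps / PCIE4_GBPS_PER_LANE


# Assumed nominal one-direction rates in Gb/s (not measured figures).
links = {
    "OCuLink as PCIe 4.0 x4": 4 * PCIE4_GBPS_PER_LANE,
    "USB4 v2 symmetric": 80.0,
    "USB4 v2 asymmetric, fast direction": 120.0,
}

for name, gbps in links.items():
    print(f"{name}: about x{pcie4_lane_equivalent(gbps):.1f}")
```

Swap in whatever rates you believe for future OCuLink or USB versions; the function only does the division.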
Using open standards is a good way to compete with Nvidia. Cheaper cables from volume. Many more applications/devices than getting 2 identical cards. You can have a proprietary "unified virtual driver", but you could also have a "unified OpenCL/Vulkan" device that games program to, and that can mix and match connected Nvidia/AMD GPUs.