https://old.reddit.com/r/singularity/comments/19dg7bl/spatialvlm_endowing_visionlanguage_models_with